Abstract. We have previously introduced role logic as a notation for 
describing properties of relational structures in shape analysis, databases 
and knowledge bases. A natural fragment of role logic corresponds to 
two-variable logic with counting and is therefore decidable. 

We show how to use role logic to describe open and closed records, as 
well as the dual of records, inverse records. We observe that the spatial 
conjunction operation of separation logic naturally models record con- 
catenation. Moreover, we show how to eliminate the spatial conjunction 
of formulas of quantifier depth one in first-order logic with counting. As 
a result, allowing spatial conjunction of formulas of quantifier depth one 
preserves the decidability of two-variable logic with counting. This result 
applies to the two-variable role logic fragment as well. 

The resulting logic smoothly integrates type system and predicate cal- 
culus notation and can be viewed as a natural generalization of the no- 
tation for constraints arising in role analysis and similar shape analysis 
approaches. 
1 Introduction 


In [36] we have introduced role logic, a notation for describing properties of 
relational structures in shape analysis, databases and knowledge bases. Role 
logic notation aims to combine the simplicity of role declarations [33] and the 
well-established first-order logic. Role logic is closed under all boolean operations 
and generalizes boolean shape analysis constraints [37]. Role logic formulas easily 
translate into the traditional first-order logic notation. Despite this generality, 
role logic enables the concise expression of common properties of data structures 
in imperative programs that manipulate complex data structures with mutable 
references. In [36, Section 4] we have established the decidability of the fragment 
$RL^2$ of role logic by exhibiting a correspondence with two-variable logic with 
counting $C^2$ [22,45]. 

Generalized records in role logic. In this paper we give a systematic 
account of field and slot declarations of role analysis [33] by introducing a set of 
role logic shorthands that allows concise description of records. Our basic idea 
is to generalize types to unary predicates on objects. Some of the aspects of our 
notion of records that indicate its generality are: 


1. We allow building new records by taking the conjunction, disjunction, or 
negation of records. 

2. In our notation, a record indicates a property of an object at a particular 
program point; objects can satisfy different record specifications at different 
program points. As a result, our records can express typestate changes 
such as object initialization [16-18,55,56] and more general changes in 
relationships between objects such as movements of objects between data 
structures [32, 33, 54]. 

3. We allow inverse records as a dual of records that specify incoming edges of 
an object in the graph of objects representing program heap. Inverse records 
allow the specification of aliasing properties of objects, generalizing unique 
pointers. Inverse records enable the convenient specification of movements 
of objects that participate in multiple data structures. 

4. We allow the specification of both open and closed records. Closed records 
specify a complete set of outgoing and incoming edges of an object. Open 
records leave certain edges unspecified, which allows orthogonal data struc- 
tures to be specified independently and then combined using logical conjunc- 
tion. 

5. We allow the concatenation of generalized records using a form of spatial 
conjunction of separation logic, while remaining within the decidable frag- 
ment of two-variable role logic. 


Separation logic. Separation logic [28, 43,51, 52] is a promising approach for 
specifying properties of programs in the presence of mutable data structures. One 
of the main uses of separation logic in previous approaches is dealing with frame 
conditions [5,28]. In contrast, our paper identifies another use of spatial logic: 
expressing record concatenation. Although our approach is based on essentially 
the same logical operation of spatial conjunction, our use of spatial conjunction for 
records is more local, because it applies to the descriptions of the neighborhood 
of an object. 

To remain within the decidable fragment of role logic, we give in Section 7 
a construction that eliminates spatial conjunction when it connects formulas of 
quantifier depth one. This construction also illustrates that spatial conjunction 
is useful for reasoning about counting stars [22] of the two-variable logic with 
counting $C^2$. To our knowledge, this is the first result that combines two-variable 
logic with counting and a form of spatial conjunction. 


Using the resulting logic. We can use specifications written in our notation to 
describe properties and relations between objects in programs with dynamically 
allocated data structures. These specifications can act as assertions, precondi- 
tions, postconditions, loop invariants or data structure invariants [33, 36, 39]. 
By selecting a finite-height lattice of properties for a given program fragment, 
abstract interpretation [15] can be used to synthesize properties of objects at in- 
termediate program points [2,3,24,33,49,50,54,58,59]. Decidability and closure 
properties of our notation are essential for the completeness and predictability 
of the resulting static analysis [38]. 


Contributions. We summarize the main contributions of this paper as follows: 


1. We present a logic which generalizes the concept of records in several direc- 
tions (Section 5). These generalizations are useful for expressing properties 
of objects and memory cells in imperative programs, and go beyond standard 
type systems. 

2. We identify a novel use of separation logic: modelling the concatenation of 
generalized records. 

3. We show how to translate role constraints from role analysis [33] to role logic 
(Section 6). 

4. We show that, under certain syntactic restrictions, we can translate spatial 
conjunction into other constructs of the decidable logic $RL^2$ (Section 7). 
We therefore obtain a notation that extends $RL^2$ with a convenient way of 
describing record concatenation, and remains decidable. 

5. We present a translation of first-order logic with spatial conjunction and 
inductive definitions into second-order logic (Section 8.2). 


Outline. Section 2 reviews the syntax and semantics of role logic. Section 3 
defines spatial conjunction in role logic and motivates its use for describing record 
concatenation. Section 4 and Section 5 show how to use spatial conjunction in 
role logic to describe a generalization of records. Section 6 demonstrates that our 
notation is a generalization of the local constraints arising in role analysis [33] 
by giving a natural embedding of role constraints into our notation. Section 7 
shows how to eliminate the spatial conjunction connective $\circledast$ from a spatial 
conjunction $F_1 \circledast F_2$ of two formulas $F_1$ and $F_2$ when $F_1$ and $F_2$ have no nested 
counting quantifiers; this is the core technical result of this paper. A consequence 
of this result is that we may allow certain uses of spatial conjunction in the $RL^2$ 
fragment of role logic while preserving the decidability of $RL^2$. Our 
extension of role logic with spatial conjunction is therefore justified: it allows 
record-like specifications to be expressed in a more natural way, and it does not 
lead outside the decidable fragment. Section 8 contains remarks on preserving the 
satisfiability of formulas in the presence of spatial conjunction and shows how to 
encode the spatial conjunction (with inductive definitions) in second-order logic. 
Section 9 presents related work, and Section 10 concludes. The Appendix contains 
the details of the correctness proof for the elimination of spatial conjunction 
from Section 7. 


2 A Decidable Two-Variable Role Logic $RL^2$ 


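\[
\begin{array}{l}
F ::= A \mid f \mid \mathit{EQ} \mid F_1 \land F_2 \mid \lnot F \mid F' \mid {\sim}F \mid \mathsf{card}^{\geq k} F \\[4pt]
e :: \{1,2\} \to D \\[4pt]
[\![A]\!]e = \overline{A}(e\,1) \qquad [\![f]\!]e = \overline{f}(e\,2,\ e\,1) \\
[\![\mathit{EQ}]\!]e = (e\,2) = (e\,1) \\
[\![F_1 \land F_2]\!]e = ([\![F_1]\!]e) \land ([\![F_2]\!]e) \qquad [\![\lnot F]\!]e = \lnot([\![F]\!]e) \\
[\![F']\!]e = [\![F]\!](e[1 \mapsto (e\,2)]) \qquad [\![{\sim}F]\!]e = [\![F]\!](e[1 \mapsto (e\,2),\ 2 \mapsto (e\,1)]) \\
[\![\mathsf{card}^{\geq k} F]\!]e = |\{d \in D \mid [\![F]\!](e[1 \mapsto d,\ 2 \mapsto (e\,1)])\}| \geq k
\end{array}
\]


Fig. 1. The Syntax and the Semantics of $RL^2$ 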


Figure 1 presents the two-variable role logic $RL^2$ [36]. We have proved in [36] 
that $RL^2$ has the same expressive power as two-variable logic with counting 
$C^2$. The logic $C^2$ is a first-order logic 1) extended with counting quantifiers 
$\exists^{\geq k} x.\, F(x)$, saying that there are at least $k$ elements $x$ satisfying formula $F(x)$ 
for some constant $k$, and 2) restricted to allow only two variable names $x$, $y$ in 
formulas. An example formula in two-variable logic with counting is 


\[ \forall x.\, A(x) \Rightarrow (\forall y.\, f(x,y) \Rightarrow \exists^{=1} x.\, g(x,y)) \qquad (1) \]


The formula (1) means that all nodes that satisfy $A(x)$ point along the field $f$ 
to nodes that have exactly one incoming $g$ edge. Note that the variables $x$ and $y$ 
may be reused via quantifier nesting, and that formulas of the form $\exists^{=k} x.\, F(x)$ 
and $\exists^{\leq k} x.\, F(x)$ are expressible as boolean combinations of formulas of the form 
$\exists^{\geq k} x.\, F(x)$. The logic $C^2$ was shown decidable in [22] and the complexity for 
the $C^2_1$ fragment of $C^2$ (with counting up to one) was established in [45]. We can 
view role logic as a variable-free version of $C^2$. Variable-free logical notations are 
attractive as generalizations of type systems because traditional type systems 
are often variable-free. The formula (1) can be written in role logic as 
$[A \Rightarrow [f \Rightarrow \mathsf{card}^{=1}{\sim}g]]$ where the construct $[F]$ is a shorthand for $\lnot\mathsf{card}^{\geq 1}\lnot F$ and 
corresponds to the universal quantifier. The expression ${\sim}g$ denotes the inverse 
of relation $g$. This paper focuses on the use of role logic to describe generalized 
records; see [36] for further examples of using role logic and [6] for advantages 
of variable-free notation in general. 
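
The semantics of Figure 1 can be sketched as a small evaluator over finite models. This is only an illustrative sketch: the tuple encoding of formulas, the function name `holds`, and the helper constructors are assumptions made for this example and do not come from [36]. An environment is a pair `(e1, e2)` standing for the values of `e 1` and `e 2`, and the example checks the role logic form of formula (1) on two small hypothetical models.

```python
def holds(formula, model, env):
    """Evaluate a formula under env = (e 1, e 2), following Fig. 1."""
    domain, unary, binary = model
    e1, e2 = env
    tag = formula[0]
    if tag == "unary":    # A holds when (e 1) is in the interpretation of A
        return e1 in unary[formula[1]]
    if tag == "binary":   # f holds when there is an f-edge from (e 2) to (e 1)
        return (e2, e1) in binary[formula[1]]
    if tag == "EQ":
        return e1 == e2
    if tag == "and":
        return holds(formula[1], model, env) and holds(formula[2], model, env)
    if tag == "not":
        return not holds(formula[1], model, env)
    if tag == "prime":    # F': evaluate F with 1 rebound to (e 2)
        return holds(formula[1], model, (e2, e2))
    if tag == "inv":      # ~F: swap the two positions
        return holds(formula[1], model, (e2, e1))
    if tag == "card":     # ("card", k, F) encodes card^{>=k} F
        _, k, body = formula
        return sum(holds(body, model, (d, e1)) for d in domain) >= k
    raise ValueError(f"unknown construct: {tag}")

# Derived shorthands (hypothetical helper names).
def neg(f): return ("not", f)
def conj(f, g): return ("and", f, g)
def implies(f, g): return neg(conj(f, neg(g)))
def card_ge(k, f): return ("card", k, f)
def card_eq(k, f): return conj(card_ge(k, f), neg(card_ge(k + 1, f)))
def box(f): return neg(card_ge(1, neg(f)))   # [F] = not card^{>=1} not F

# [A => [f => card^{=1} ~g]], the role logic form of formula (1).
formula_1 = box(implies(("unary", "A"),
                        box(implies(("binary", "f"),
                                    card_eq(1, ("inv", ("binary", "g")))))))

# Node 0 is in A and points along f to node 1, which has one incoming g-edge.
model_ok = ({0, 1, 2}, {"A": {0}}, {"f": {(0, 1)}, "g": {(2, 1)}})
# Same model, but node 1 now has two incoming g-edges.
model_bad = ({0, 1, 2}, {"A": {0}}, {"f": {(0, 1)}, "g": {(2, 1), (0, 1)}})

print(holds(formula_1, model_ok, (0, 0)))
print(holds(formula_1, model_bad, (0, 0)))
```

Because the formula is closed, the starting environment `(0, 0)` is arbitrary; any other pair of domain elements gives the same answer.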


3 Spatial Conjunction 


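\[
\begin{array}{l}
[\![F_1 \circledast F_2]\!]e = \exists e_1, e_2.\ \mathsf{split}\ e\ [e_1\ e_2] \land [\![F_1]\!]e_1 \land [\![F_2]\!]e_2 \\[4pt]
\mathsf{split}\ e\ [e_1\ e_2] = \\
\quad \forall A \in \mathcal{A}.\ \forall d \in D.\ (e\,A)\,d \iff (e_1\,A)\,d \lor (e_2\,A)\,d \ \land\ \lnot((e_1\,A)\,d \land (e_2\,A)\,d) \ \land \\
\quad \forall f \in \mathcal{F}.\ \forall d_1, d_2 \in D. \\
\qquad (e\,f)\,d_1\,d_2 \iff (e_1\,f)\,d_1\,d_2 \lor (e_2\,f)\,d_1\,d_2 \ \land\ \lnot((e_1\,f)\,d_1\,d_2 \land (e_2\,f)\,d_1\,d_2) \\[4pt]
\mathsf{emp} \equiv [[\,\bigwedge_{A \in \mathcal{A}} \lnot A \ \land \bigwedge_{f \in \mathcal{F}} \lnot f\,]] \\[4pt]
\text{priority: } \land \text{ binds strongest, then } \circledast, \text{ then } \lor \\
F \sim G \text{ means } \forall e.\ [\![F]\!]e = [\![G]\!]e \\
(F_1 \circledast F_2) \circledast F_3 \sim F_1 \circledast (F_2 \circledast F_3) \\
F \circledast \mathsf{emp} \sim \mathsf{emp} \circledast F \sim F \\
F_1 \circledast F_2 \sim F_2 \circledast F_1 \\
F_1 \circledast (F_2 \lor F_3) \sim F_1 \circledast F_2 \lor F_1 \circledast F_3
\end{array}
\]


Fig. 2. Semantics and Properties of Spatial Conjunction $\circledast$. 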


Figure 2 shows our semantics of spatial conjunction $\circledast$. To motivate our use of 
spatial conjunction, we first illustrate how role logic supports the description of 
simple properties of objects in a concise way. Indeed, one of the design goals of 
role logic is to have a logic-based specification language where simple properties 
of objects are as convenient to write as type declarations in a language like Java. 


Example 1. The formula $[f \Rightarrow A]$ is true for an object all of whose $f$-fields point 
to an $A$ object, and $[g \Rightarrow B]$ means that every $g$-field points to a $B$ object, so 


$[f \Rightarrow A] \land [g \Rightarrow B]$ 


denotes the objects that have both $f$ pointing to an $A$ object and $g$ pointing to a 
$B$ object. Such a specification is as concise as the following Java class declaration 


class C { A f; B g; } 


Example 1 illustrates how the presence of conjunction $\land$ in role logic enables 
combination of orthogonal properties such as constraints on distinct fields. How- 
ever, not all properties naturally compose using conjunction. 


Example 2. Consider a program that contains three fields, modelled as binary 
relations $f$, $g$, $h$. The formula $P_f \equiv (\mathsf{card}^{=1} f) \land (\mathsf{card}^{=0}(g \lor h))$ means that 
the object has only one outgoing $f$-edge and no other edges. The formula 
$P_g \equiv (\mathsf{card}^{=1} g) \land (\mathsf{card}^{=0}(f \lor h))$ means that the object has only one outgoing $g$-edge 
and no other edges. If we “physically join” two records, each of which has one 
field, we obtain a record that has two fields, and is described by the formula 


\[ P_{fg} \equiv (\mathsf{card}^{=1} f) \land (\mathsf{card}^{=1} g) \land (\mathsf{card}^{=0} h) \]


Note that it is not the case that $P_{fg} \sim P_f \land P_g$. More generally, no boolean 
combination of $P_f$ and $P_g$ yields $P_{fg}$. 


Example 2 prompts the question: is there an operation for joining specifications 
that allows us to combine $P_f$ and $P_g$ into $P_{fg}$? Moreover, can we 
define such an operation on records viewed as arbitrary formulas in role logic? 

It turns out that there is a natural way to describe the set of models of formula 
$P_{fg}$ in Example 2 as the result of “physically merging” the edges (relations) of 
the models of $P_f$ and the models of $P_g$. The merging of disjoint models of formulas is 
the idea behind the definition of spatial conjunction $\circledast$ in Figure 2. The predicate 
$(\mathsf{split}\ e\ [e_1\ e_2])$ is true iff the relations of the model (environment) $e$ can be split 
into $e_1$ and $e_2$; the notation generalizes to splitting into any number of 
environments. 


Example 3. For $P_f$, $P_g$, and $P_{fg}$ of Example 2, we have $P_{fg} \sim P_f \circledast P_g$. 


Note that the operation $\circledast$ is associative and commutative. The formula $\mathsf{emp}$, 
which asserts that all predicates are false, is the unit for $\circledast$. Moreover, $\circledast$ 
distributes over $\lor$. 
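
The claim of Example 3 can be checked by brute force on small models. The sketch below is a hedged illustration, not the construction of Section 7: it assumes a language with no unary predicates and exactly the three fields of Example 2, it represents a model as a dictionary from field names to sets of edges, and all function names are hypothetical. A split assigns every edge of the model to exactly one of the two parts, as in the definition of split in Figure 2.

```python
from itertools import combinations, product

FIELDS = ("f", "g", "h")  # the three fields of Example 2

def out_degree(model, field, obj):
    """Number of outgoing edges of obj along the given field."""
    return sum(1 for (src, _dst) in model[field] if src == obj)

def p_f(model, obj):   # card^{=1} f  and  card^{=0} (g or h)
    return (out_degree(model, "f", obj) == 1
            and out_degree(model, "g", obj) == 0
            and out_degree(model, "h", obj) == 0)

def p_g(model, obj):   # card^{=1} g  and  card^{=0} (f or h)
    return (out_degree(model, "g", obj) == 1
            and out_degree(model, "f", obj) == 0
            and out_degree(model, "h", obj) == 0)

def p_fg(model, obj):  # card^{=1} f  and  card^{=1} g  and  card^{=0} h
    return (out_degree(model, "f", obj) == 1
            and out_degree(model, "g", obj) == 1
            and out_degree(model, "h", obj) == 0)

def splits(model):
    """Yield every way of assigning each edge to exactly one of two parts."""
    tuples = [(field, pair) for field in FIELDS for pair in sorted(model[field])]
    for sides in product((0, 1), repeat=len(tuples)):
        parts = ({f: set() for f in FIELDS}, {f: set() for f in FIELDS})
        for (field, pair), side in zip(tuples, sides):
            parts[side][field].add(pair)
        yield parts

def spatial(p, q, model, obj):
    """Spatial conjunction of two properties, by enumerating all splits."""
    return any(p(m1, obj) and q(m2, obj) for m1, m2 in splits(model))

def all_models(domain):
    """Every interpretation of the three fields over the given domain."""
    pairs = [(a, b) for a in domain for b in domain]
    subsets = [frozenset(c) for r in range(len(pairs) + 1)
               for c in combinations(pairs, r)]
    for relations in product(subsets, repeat=len(FIELDS)):
        yield dict(zip(FIELDS, relations))

def agrees_everywhere(domain):
    """Check that P_fg and the spatial conjunction of P_f and P_g coincide."""
    return all(p_fg(m, o) == spatial(p_f, p_g, m, o)
               for m in all_models(domain) for o in domain)

example = {"f": {(0, 1)}, "g": {(0, 0)}, "h": set()}
print(p_fg(example, 0), spatial(p_f, p_g, example, 0))
print(p_f(example, 0) and p_g(example, 0))  # ordinary conjunction fails
print(agrees_everywhere((0,)))
```

The exhaustive check is run on a one-element domain to keep the enumeration small; `agrees_everywhere((0, 1))` performs the same check on a two-element domain at a noticeably higher cost.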


A note on relationship with [28]. The semantics of spatial conjunction in 
Figure 2 match the semantics of [28], with two differences. 

A small technical difference is that Figure 2 splits the edges of the model 
(the tuples of the relations), whereas [28] splits the domain. The difference arises 
because the elements of the domain in [28] are locations, whereas the elements 
of our models are objects. To represent a location in our view, we would use a 
tuple $(o, f)$ where $o$ is an element of the domain and $f$ is a field name. 

A higher-level difference is that the use of spatial logic we propose in this 
paper is the notation for records (Section 5), as opposed to the description of 
global heap properties. When used for formulas of quantifier depth one (Sec- 
tion 7), spatial conjunction does not even change the set of definable relations 
of two-variable logic with counting. 


4 Field Complement 


As a step towards record calculus in role logic, this section introduces the notion 
of a field complement, which makes it easier to describe records in role logic. 


Example 4. Consider the formula $P_f \equiv (\mathsf{card}^{=1} f) \land (\mathsf{card}^{=0}(g \lor h))$ from 
Example 2, stating the property that an object has only one outgoing $f$-edge and no 
other edges. Property $P_f$ has little to do with $g$ or $h$, yet $g$ and $h$ explicitly occur 
in $P_f$. Moreover, we need to know the entire set of relations in the language to 
write $P_f$; if the language contains an additional field $i$, the property $P_f$ would 
become $P_f \equiv (\mathsf{card}^{=1} f) \land (\mathsf{card}^{=0}(g \lor h \lor i))$. Note also that $\lnot f$ is not the same 
as $g \lor h \lor i$, because $\lnot f$ computes the complement of the value of the relation $f$ 
with respect to the universal set, whereas $g \lor h \lor i$ is the union of all relations 
other than $f$. 


To address the notational problem illustrated in Example 4, we introduce the 
symbol $\mathsf{edges}$, which denotes the union of all binary relations, and the notation 
$-f$ (field complement of $f$), which denotes the union of all relations other than $f$: 


\[ \mathsf{edges} \equiv \bigvee_{g \in \mathcal{F}} g \qquad\qquad -f \equiv \bigvee_{g \neq f} g \]


This additional notation allows us to avoid explicitly listing all fields in the 
language when stating properties like Py. 


Example 5. Formula $P_f$ from Example 4 can be written as 
$P_f \equiv (\mathsf{card}^{=1} f) \land (\mathsf{card}^{=0} {-f})$, which mentions only $f$. Even when the language is extended with 
additional relations, $P_f$ still denotes the intended property. Similarly, to denote 
the property of an object that has outgoing fields given by $P_f$ and has no 
incoming fields, we use the predicate $P_f \land \mathsf{card}^{=0}{\sim}\mathsf{edges}$. 


We use the notation $\mathsf{edges}$ and $-f$ to build the notation for records and inverse 
records in Section 5 below. 
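
The difference between the field complement, the logical complement, and the symbol edges can be illustrated on a concrete model. The following is a minimal sketch under assumed representations: a model is a dictionary from field names to sets of edges, and the function names are hypothetical. It also shows the point of Example 5, namely that a property written with the field complement keeps its intended meaning when the language gains a new field.

```python
def edges(model):
    """Union of all binary relations of the model."""
    return set().union(*model.values())

def field_complement(model, field):
    """Union of all relations other than the given field."""
    return set().union(*(rel for name, rel in model.items() if name != field))

def logical_complement(model, field, domain):
    """Complement of the field with respect to all pairs over the domain."""
    return {(a, b) for a in domain for b in domain} - model[field]

def out_degree(relation, obj):
    return sum(1 for (src, _dst) in relation if src == obj)

def p_f(model, obj):
    """card^{=1} f and card^{=0} of the field complement of f."""
    return (out_degree(model["f"], obj) == 1
            and out_degree(field_complement(model, "f"), obj) == 0)

domain = (0, 1, 2)
model = {"f": {(0, 1)}, "g": set(), "h": set()}
extended = dict(model, i={(0, 2)})  # the language gains a field i

print(p_f(model, 0))      # only an f-edge leaves object 0
print(p_f(extended, 0))   # the new i-edge is seen without rewriting p_f
print(len(logical_complement(model, "f", domain)), len(field_complement(model, "f")))
```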

A note on ternary relation interpretation. It is possible to provide a 
notation for relations that generalizes the notation $\mathsf{edges}$ and $-f$. The idea of 
this generalization is to change the definition of the model (environment). Instead 
of a model that specifies a binary relation for each field, the model specifies the 
value of one ternary relation $H$ and a unary tag-predicate for each field name. 
For example, instead of the model that provides interpretations $f_I$ and $g_I$ for two 
binary relations $f$ and $g$, we could use the model that provides the interpretation 
$[\![H]\!]$, where 

\[ [\![H]\!]\, o_1\, o_2\, n = (n = f_0 \land f_I\, o_1\, o_2) \ \lor\ (n = g_0 \land g_I\, o_1\, o_2) \]


and the interpretation of unary tag-predicates $f$ and $g$. Here $f_0$ is an element 
of the domain that tags tuples coming from $[\![f]\!]$, whereas $g_0$ tags tuples coming 
from $[\![g]\!]$. We interpret $f$ as a predicate that is true only on the element $f_0$, and 
similarly $g$ as a predicate true only on the element $g_0$. We then introduce the 
following dereferencing shorthand: 


\[ {\uparrow}F \equiv \{H \land F\} \qquad (2) \]


The expression ${\uparrow}f$ now denotes the original interpretation of $f$, that is, $[\![{\uparrow}f]\!] = f_I$. 
Moreover, ${\uparrow}\lnot f$ corresponds to the field complement $-f$, and ${\uparrow}\mathsf{True}$ corresponds to 
$\mathsf{edges}$. Note that expressions of the form ${\uparrow}(\lnot f \land \lnot g)$ are now also available. 
Let $B$ be a boolean combination of unary predicates denoting fields. These 
unary predicates are disjoint, so transforming $B$ into disjunctive normal form 
and applying the property 


\[ {\uparrow}(B_1 \lor B_2) = {\uparrow}B_1 \lor {\uparrow}B_2 \]


which follows from (2), allows transforming ${\uparrow}B$ into a boolean combination of 
expressions of the form ${\uparrow}f$ and ${\uparrow}g$. This means that we obtain no additional 
expressive power using expressions of the form ${\uparrow}B$ where $B$ is a boolean 
combination of unary predicates denoting fields, so for simplicity we do not consider 
such “ternary relation interpretation” further in this paper. 
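
The ternary encoding described above can be sketched directly on finite relations. This is an illustrative sketch only: the tags are represented by the strings `"f0"` and `"g0"` rather than by domain elements, and the names `encode` and `deref` are hypothetical. The function `deref` plays the role of the dereferencing shorthand applied to a set of selected tags.

```python
def encode(relations, tags):
    """Build the ternary relation H: each edge is paired with the tag of its field."""
    return {(o1, o2, tags[name])
            for name, rel in relations.items() for (o1, o2) in rel}

def deref(h, selected_tags):
    """Edges (o1, o2) that H relates to some selected tag."""
    return {(o1, o2) for (o1, o2, n) in h if n in selected_tags}

relations = {"f": {(0, 1), (1, 1)}, "g": {(0, 1), (2, 0)}}
tags = {"f": "f0", "g": "g0"}   # hypothetical tag elements
h = encode(relations, tags)
all_tags = set(tags.values())

# Dereferencing the tag of f recovers the original interpretation of f.
print(deref(h, {"f0"}) == relations["f"])
# Dereferencing "not f" gives the field complement of f.
print(deref(h, all_tags - {"f0"}) == relations["g"])
# Dereferencing True gives edges, the union of all fields.
print(deref(h, all_tags) == relations["f"] | relations["g"])
# Dereferencing distributes over disjunction of tag predicates.
print(deref(h, {"f0", "g0"}) == deref(h, {"f0"}) | deref(h, {"g0"}))
```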


5 Records and Inverse Records 


In this section we use role logic with spatial conjunction and field complement 
from Section 4 to introduce a notation for records. We also introduce inverse 
records, which are dual to records, and correspond to slot constraints in role 
analysis [33]. 


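\[
\begin{array}{ll}
\text{multifield:} & f \xrightarrow{*} A \equiv \mathsf{card}^{=0}(-f \lor (f \land \lnot A)) \\[4pt]
\text{field:} & f \xrightarrow{s} A \equiv \mathsf{card}^{s}(A \land f) \land f \xrightarrow{*} A \\
 & s \text{ of the form } {=}k,\ {\leq}k, \text{ or } {\geq}k, \text{ for } k \in \{0,1,2,\ldots\} \\
 & f \to A \equiv f \xrightarrow{=1} A \\[4pt]
\text{multislot:} & A \xleftarrow{*} f \equiv \mathsf{card}^{=0}({\sim}{-f} \lor ({\sim}f \land \lnot A)) \\[4pt]
\text{slot:} & A \xleftarrow{s} f \equiv \mathsf{card}^{s}(A \land {\sim}f) \land A \xleftarrow{*} f \\
 & s \text{ of the form } {=}k,\ {\leq}k, \text{ or } {\geq}k, \text{ for } k \in \{0,1,2,\ldots\} \\
 & A \leftarrow f \equiv A \xleftarrow{=1} f
\end{array}
\]

fm ::= field | multifield 
closedRecord ::= fm | closedRecord $\circledast$ fm 
openRecord ::= closedRecord $\circledast$ True 
sm ::= slot | multislot 
closedInvRecord ::= sm | closedInvRecord $\circledast$ sm 
openInvRecord ::= closedInvRecord $\circledast$ True 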


Fig. 3. Record Notation 


Figure 3 presents the notation for records and inverse records. A field predicate 
$f \to A$ is true for an object whose only outgoing edge in the graph (model) is 
an $f$-edge terminating at $A$. Dually, a slot predicate $A \leftarrow f$ is true for an object 
whose only incoming edge in the graph is an $f$-edge originating at $A$. A multifield 
predicate $f \xrightarrow{*} A$ is true iff the object has any number of outgoing $f$-edges 
terminating at $A$, and no other edges. Dually, a multislot predicate $A \xleftarrow{*} f$ is true iff 
the object has any number of incoming $f$-edges originating from $A$, and no other 
edges. We also allow the notation $f \xrightarrow{s} A$ where $s$ is an expression of the form ${=}k$, 
${\leq}k$, or ${\geq}k$. This notation gives a bound on the number of outgoing edges, and 
implies that there are no other outgoing edges. We similarly introduce $A \xleftarrow{s} f$. A 
closed record is a spatial conjunction of fields and multifields. An open record is 
a spatial conjunction of a closed record with True. While a closed record allows 
only the listed fields, an open record allows any number of additional fields. 
Inverse records are dual to records, and we similarly distinguish open and closed 
inverse records. 


Example 6. To describe a closed record whose only fields are $f$ and $g$ where 
$f$-fields point to objects in the set $A$ and $g$-fields point to objects in the set 
$B$, we use the predicate $P_1 \equiv f \to A \circledast g \to B$. The definition of $P_1$ lists all 
fields of the object. To specify an open record which certainly has fields $f$ and $g$ 
but may or may not have other fields, we write $P_2 \equiv f \to A \circledast g \to B \circledast \mathsf{True}$. 
Neither $P_1$ nor $P_2$ restricts incoming references of an object. To specify that 
the only incoming references of an object are from the field $h$, we conjoin $P_2$ 
with the closed inverse record consisting of a single multislot $\mathsf{True} \xleftarrow{*} h$, yielding 
the predicate $P_3 \equiv P_2 \land \mathsf{True} \xleftarrow{*} h$. To specify that an object has exactly 
one incoming reference, and that the incoming reference is from the $h$ field and 
originates from an object belonging to the set $C$, we use $P_4 \equiv P_2 \land C \leftarrow h$. 
Note that specifications $P_3$ and $P_4$ go beyond most standard type systems in 
their ability to specify the incoming (in addition to the outgoing) references of 
objects. 


6 Role Constraints 


Role constraints were introduced in [30,31, 33]. In this section we show that role 
logic is a natural generalization of role constraints by giving a translation from 
role constraints to role logic. A logical view of role constraints is also suggested 
in [35,35]. A role is a set of objects that satisfy a conjunction of the following 
four kinds of constraints: field constraints, slot constraints, identities, acyclicities. 
In this paper we show that role logic naturally models field constraints, slot 
constraints, and identities.¹ 

Roles describing complete sets of fields and slots. Figure 4 shows the 
translation of role constraints [33, Section 3] into role logic formulas. The sim- 
plicity of the translation is a consequence of the notation for records that we 
have developed in this paper. 


1 Acyclicities go beyond first-order logic because they involve non-local transitive closure 
properties. 


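\[
\begin{array}{l}
C[\![\text{fields } F;\ \text{slots } S;\ \text{identities } I;\ \text{acyclic } A]\!] \equiv C[\![\text{fields } F]\!] \land C[\![\text{slots } S]\!] \ \land \\
\qquad [\![\text{identities } I]\!] \land [\![\text{acyclic } A]\!] \\
C[\![\text{fields } f_1 : S_1, \ldots, f_n : S_n]\!] \equiv f_1 \to S_1 \circledast \ldots \circledast f_n \to S_n \\
C[\![\text{slots } S_1.f_1, \ldots, S_n.f_n]\!] \equiv S_1 \leftarrow f_1 \circledast \ldots \circledast S_n \leftarrow f_n \\
[\![\text{identities } f_1.g_1, \ldots, f_n.g_n]\!] \equiv \bigwedge_{i=1}^{n} [f_i \Rightarrow {\sim}g_i] \\
[\![\text{acyclic } f_1, \ldots, f_n]\!] \equiv \mathsf{acyclic}\,(\bigvee_{i=1}^{n} f_i)
\end{array}
\]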


Fig. 4. Translation of Role Constraints [33] into Role Logic Formulas 


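\[
\begin{array}{l}
O[\![\text{fields } F;\ \text{slots } S;\ \text{identities } I;\ \text{acyclic } A]\!] \equiv O[\![\text{fields } F]\!] \land O[\![\text{slots } S]\!] \ \land \\
\qquad [\![\text{identities } I]\!] \land [\![\text{acyclic } A]\!] \\[4pt]
O[\![\text{fields } f_1 : S_1, \ldots, f_n : S_n]\!] \equiv C[\![\text{fields } f_1 : S_1, \ldots, f_n : S_n]\!] \circledast \mathsf{card}^{=0}(\bigvee_{i=1}^{n} f_i) \\[4pt]
O[\![g_1, \ldots, g_m \text{ slots } S_1.f_1, \ldots, S_n.f_n]\!] \equiv C[\![\text{slots } S_1.f_1, \ldots, S_n.f_n]\!] \circledast \mathsf{card}^{=0}(\bigvee_{i=1}^{m} {\sim}g_i)
\end{array}
\]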


Fig. 5. Translation of Simultaneous Role Constraints [33, Section 7.2] into Role Logic 
Formulas. See also Figure 4. 


Simultaneous Roles. In object-oriented programs, objects may participate 
in multiple data structures. The idea of simultaneous roles [33, Section 7.2] is 
to associate one role for the participation of an object in one data structure. 
When the object participates in multiple data structures, the object plays mul- 
tiple roles. Role logic naturally models simultaneous roles: each role is a unary 
predicate, and if an object satisfies multiple roles, then the object satisfies the 
conjunction of predicates. Figure 5 presents the translation of field and slot con- 
straints of simultaneous roles into role logic. Whereas the roles of [33, Section 
3] translate to closed records and closed inverse records, the simultaneous roles 
of [33, Section 7.2] translate to specifications that are closer to open records and 
open inverse records. 


7 Eliminating Spatial Conjunction in $RL^2$ 


Preserving the decidability. Previous sections have demonstrated the use- 
fulness of adding record concatenation in the form of spatial conjunction to our 
notation for generalized records. However, a key question remains: is the result- 
ing extended notation decidable? In this section we give an affirmative answer 
to this question by showing how to compute the spatial conjunction using the 
remaining logical operations for a large class of record specifications. 

Approach. Consider two formulas $F_1$ and $F_2$ in first-order logic with counting, 
where both $F_1$ and $F_2$ have quantifier depth one. An equivalent way of stating 
the condition on $F_1$ and $F_2$ is that there are no nested occurrences of quantifiers. 
(Note that we count one application of $\exists^{\geq k} x.\, P$ as one quantifier, regardless of 
the value $k$.) We show that, under these conditions, the spatial conjunction 
$F_1 \circledast F_2$ can be written as an equivalent formula $F_3$ where $F_3$ does not contain 
the spatial conjunction operation $\circledast$. The proof proceeds by writing formulas $F_1$, 
$F_2$ in a normal form, as a disjunction of counting stars [22], and showing that the 
spatial conjunction of counting stars is equivalent to a disjunction of counting 
stars. 

As a consequence of the results in this section, adding the operation $\circledast$ to 
logic with counting does not change its expressive power provided that both $F_1$ 
and $F_2$ have quantifier depth at most one. Here we allow $F_1$ and $F_2$ themselves 
to contain spatial conjunction, because we may eliminate spatial conjunction in 
$F_1$ and $F_2$ recursively. Applying these results to two-variable logic with counting 
$C^2$, we conclude that introducing into $C^2$ the spatial conjunction of formulas 
of quantifier depth one preserves the decidability of $C^2$. Furthermore, thanks to 
the translations between $C^2$ and $RL^2$ in [36], if we allow the spatial conjunction 
of $RL^2$ formulas with no nested card occurrences, we preserve the decidability of 
the logic $RL^2$. The formulas of the resulting logic are given by 


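\[
\begin{array}{l}
F ::= A \mid f \mid \mathit{EQ} \mid F_1 \land F_2 \mid \lnot F \mid F' \mid {\sim}F \mid \mathsf{card}^{\geq k} F \\
\quad \mid\ F_1 \circledast F_2, \ \text{ if } F_1 \text{ and } F_2 \text{ have no nested } \mathsf{card} \text{ occurrences}
\end{array}
\]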


Note that record specifications in Figure 3 contain no nested card occurrences, 
so joining them using $\circledast$ yields formulas in the decidable fragment. Hence, in 
addition to quantifiers and boolean operations, the resulting logic supports a 
generalization of record concatenation, and is still decidable; this decidability 
property is what we show in the sequel. We present a sketch of the proof; see 
the Appendix for proof details. 


7.1 Atomic Type Formulas 


In this section we introduce classes of formulas that correspond to the model- 
theoretic notion of atomic type [44, Page 20] (see [25, Page 42] and [12, Page 78] 
for the notion of type in general). We then introduce formulas that describe the 
notion of counting stars [22,45]. We conclude this section with Proposition 12, 
which gives the normal form for formulas of quantifier depth one. 

If $\mathcal{C} = C_1,\ldots,C_m$ is a finite set of formulas, then a cube over $\mathcal{C}$ is a conjunction 
of the form $C_1^{\alpha_1} \land \ldots \land C_m^{\alpha_m}$ where $\alpha_i \in \{0,1\}$, $C^1 = C$ and $C^0 = \lnot C$. For 
simplicity, fix a finite language $L = \mathcal{A} \cup \mathcal{F}$ with $\mathcal{A}$ a finite set of unary predicate 
symbols and $\mathcal{F}$ a finite set of binary predicate symbols. We work in predicate 
calculus with equality, and assume that the equality “=”, where ${=} \notin \mathcal{F}$, is present 
as a binary relation symbol, unless explicitly stated otherwise. We use $D$ to 
denote a finite domain of interpretation and $e$ to denote a model with variable 
assignment; $e$ maps $\mathcal{A}$ to $2^D$, maps $\mathcal{F}$ to $2^{D \times D}$ and maps variables to elements 
of $D$. Let $x_1,\ldots,x_n$ be a finite list of distinct variables. Let $\mathcal{C}$ be the set of all 
atomic formulas $F$ such that $FV(F) \subseteq \{x_1,\ldots,x_n\}$. The set $\mathcal{C}$ is finite (in our 
case it has $|\mathcal{A}|n + (|\mathcal{F}|+1)n^2$ elements). We call a cube over $\mathcal{C}$ a complete atomic 
type (CAT) formula. 


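Example 7. If $\mathcal{A} = \{A\}$ and $\mathcal{F} = \{f\}$, then 

\[
\begin{array}{l}
A(x_1) \land \lnot A(x_2) \ \land \\
\lnot f(x_1,x_1) \land \lnot f(x_2,x_2) \land f(x_1,x_2) \land \lnot f(x_2,x_1) \ \land \\
x_1 = x_1 \land x_2 = x_2 \land x_1 \neq x_2 \land x_2 \neq x_1
\end{array}
\]

is a CAT formula. 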


We may treat conjunction of literals as the set of literals, so we say that “a literal 
belongs to the conjunction” and apply set-theoretic operations on conjunctions 
of literals. 

From the disjunctive normal form theorem for propositional logic, we obtain 
the following Proposition 8. 


Proposition 8. Every quantifier-free formula $F$ such that $FV(F) \subseteq 
\{x_1,\ldots,x_n\}$ is equivalent to a disjunction of CAT formulas $C$ such that $FV(C) = 
\{x_1,\ldots,x_n\}$. 


A CAT formula may be contradictory if, for example, it contains the literal 
$x_i \neq x_i$ as a conjunct. We next define classes of CAT formulas that are satisfiable 
in the presence of equality. Let $x_1,\ldots,x_n$ be distinct variables. A general-case 
CAT (GCCAT) formula is a CAT formula $F$ such that the following two conditions 
hold: 1) $FV(F) = \{x_1,\ldots,x_n\}$; 2) for all $1 \leq i,j \leq n$, the conjunct $x_i = x_j$ 
is in $F$ iff $i = j$. Let $x_1,\ldots,x_n$ and $y_1,\ldots,y_m$ be distinct variables. An equality 
CAT (EQCAT) formula is a formula of the form $\bigwedge_{j=1}^{m} y_j = x_{i_j} \land F$, where 
$1 \leq i_1,\ldots,i_m \leq n$ and $F$ is a GCCAT formula such that $FV(F) = \{x_1,\ldots,x_n\}$. 
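
The definitions of CAT and GCCAT formulas can be made concrete by enumeration. The sketch below is an illustration under assumed encodings: an atomic formula is a pair of a symbol and a tuple of variable names, a cube is a dictionary from atoms to signs, and the function names are hypothetical. For the language of Example 7 with two variables it enumerates all cubes and filters the general-case ones.

```python
from itertools import product

def atoms(unary, binary, variables):
    """All atomic formulas over the variables, with '=' as an extra binary symbol."""
    result = [(a, (x,)) for a in unary for x in variables]
    result += [(r, (x, y)) for r in list(binary) + ["="]
               for x in variables for y in variables]
    return result

def cat_formulas(atom_list):
    """Every cube: each atom occurs either positively (True) or negated (False)."""
    for signs in product((True, False), repeat=len(atom_list)):
        yield dict(zip(atom_list, signs))

def is_gccat(cube):
    """The conjunct x_i = x_j is positive exactly when the two variables coincide."""
    return all(sign == (args[0] == args[1])
               for (symbol, args), sign in cube.items() if symbol == "=")

variables = ("x1", "x2")
atom_list = atoms(["A"], ["f"], variables)
cubes = list(cat_formulas(atom_list))
gccats = [c for c in cubes if is_gccat(c)]

# The CAT formula of Example 7, written as a sign assignment.
example_7 = {
    ("A", ("x1",)): True, ("A", ("x2",)): False,
    ("f", ("x1", "x1")): False, ("f", ("x2", "x2")): False,
    ("f", ("x1", "x2")): True, ("f", ("x2", "x1")): False,
    ("=", ("x1", "x1")): True, ("=", ("x2", "x2")): True,
    ("=", ("x1", "x2")): False, ("=", ("x2", "x1")): False,
}

print(len(atom_list), len(cubes), len(gccats))
print(example_7 in cubes, is_gccat(example_7))
```

The number of atoms agrees with the count $|\mathcal{A}|n + (|\mathcal{F}|+1)n^2$ stated above, which is 10 for one unary symbol, one binary symbol, and two variables.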


Lemma 9. Every CAT formula F is either contradictory, or is equivalent to an 
EQCAT formula F’ such that FV(F’) = FV(F). 


From Proposition 8 and Lemma 9, we obtain the following Proposition 10. 


Proposition 10. Every quantifier-free formula $F$ such that $FV(F) \subseteq 
\{x_1,\ldots,x_n\}$ can be written as a disjunction of EQCAT formulas $C$ such that 
$FV(C) = \{x_1,\ldots,x_n\}$. 


We next introduce the notion of an extension of a GCCAT formula. Let 
$x, x_1,\ldots,x_n$ be distinct variables and $F$ be a GCCAT formula such that 
$FV(F) = \{x_1,\ldots,x_n\}$. We say that $F'$ is an $x$-extension of $F$, and write 
$F' \in \mathsf{exts}(F, x)$ iff all of the following conditions hold: 1) $F \land F'$ is a GCCAT 
formula; 2) $FV(F \land F') = \{x, x_1,\ldots,x_n\}$; 3) $F$ and $F'$ have no common atomic 
formulas. Note that if $FV(F_1) = FV(F_2)$, then $\mathsf{exts}(F_1,x) = \mathsf{exts}(F_2,x)$, i.e. the 
set of extensions of a GCCAT formula depends only on the free variables of the 
formula; we introduce additional notation $\mathsf{exts}(x_1,\ldots,x_n,x)$ to denote $\mathsf{exts}(F, x)$ 
for $FV(F) = \{x_1,\ldots,x_n\}$. 

To define a normal form for formulas of quantifier depth one, we introduce 
the notion of a $k$-counting star. If $p \geq 2$ is an integer, let $p^+$ be 
a new symbol which represents the co-finite set of integers $\{p, p+1, \ldots\}$. Let 
$C_p = \{0,1,\ldots,p-1,p^+\}$. If $c \in C_p$, by $\exists^{c} x.\, P$ we mean $\exists^{=c} x.\, P$ if $c$ is an integer, 
and $\exists^{\geq p} x.\, P$ if $c = p^+$. We say that a formula $F$ has a counting degree of at most 
$p$ iff the only counting quantifiers in $F$ are of the form $\exists^{c} x.\, G$ for some $c \in C_{p+1}$. 


Definition 11 (Counting Star Formula). Let $x$, $x_1,\ldots,x_n$, and $y_1,\ldots,y_m$ 
be distinct variables, $k \geq 1$ a positive integer, and $F$ a GCCAT formula such 
that $FV(F) = \{x_1,\ldots,x_n\}$. A $k$-counting star function for $F$ is a function 
$\gamma : \mathsf{exts}(F, x) \to C_{k+1}$. A $k$-counting-star formula for $\gamma$ is a formula of the form 


\[ \bigwedge_{j=1}^{m} y_j = x_{i_j} \ \land\ F \ \land \bigwedge_{F' \in \mathsf{exts}(F,x)} \exists^{\gamma(F')} x.\, F' \]


where $1 \leq i_1,\ldots,i_m \leq n$. 


Note that in Definition 11, formula $\bigwedge_{j=1}^{m} y_j = x_{i_j} \land F$ is an EQCAT formula, and 
formula $\bigwedge_{j=1}^{m} y_j = x_{i_j} \land F \land F'$ is an EQCAT formula for each $F' \in \mathsf{exts}(F, x)$. 
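
To make the counting star function of Definition 11 concrete, the following sketch computes, for a given finite model and a given tuple of distinct elements assigned to the variables, how many other elements realize each extension. It is an illustration under stated assumptions rather than part of the construction: an extension is represented only by the set of its positive atoms (negated literals are implicit), extensions realized by no element are omitted and have count 0, and the names `abstract_count`, `extension_type`, and `counting_star` are hypothetical.

```python
from collections import Counter

def abstract_count(n, p):
    """Map a natural number to C_p = {0, ..., p-1, p+}, with p+ written as a string."""
    return n if n < p else f"{p}+"

def extension_type(d, xs, unary, binary):
    """Positive atoms mentioning x (other than x = x) when x is interpreted as d."""
    positive = {(a, "x") for a, members in unary.items() if d in members}
    for f, pairs in binary.items():
        if (d, d) in pairs:
            positive.add((f, "x", "x"))
        for i, xi in enumerate(xs, start=1):
            if (d, xi) in pairs:
                positive.add((f, "x", f"x{i}"))
            if (xi, d) in pairs:
                positive.add((f, f"x{i}", "x"))
    return frozenset(positive)

def counting_star(domain, xs, unary, binary, k):
    """Count elements per extension, abstracted to C_{k+1}.

    Elements equal to some x_i are skipped, because every extension
    contains the literals stating that x differs from each x_i.
    """
    counts = Counter(extension_type(d, xs, unary, binary)
                     for d in domain if d not in xs)
    return {ext: abstract_count(n, k + 1) for ext, n in counts.items()}

# x1 is element 0; it has f-edges to three A-objects; element 4 is isolated.
star = counting_star(range(5), (0,),
                     {"A": {1, 2, 3}},
                     {"f": {(0, 1), (0, 2), (0, 3)}},
                     k=1)
for ext, count in star.items():
    print(sorted(ext), count)
```

With `k=1` the counts range over $C_2 = \{0, 1, 2^+\}$, so the three elements that are in $A$ and are targets of an $f$-edge from $x_1$ are recorded as `"2+"`, while the single isolated element is recorded as `1`.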

The following Proposition 12 shows that formulas of quantifier depth at most 
one are equivalent to disjunctions of counting stars. 


Proposition 12 (Depth-One Normal Form). Let $F$ be a formula such 
that $F$ has quantifier depth at most one, $F$ has counting degree at most $k$, and 
$FV(F) \subseteq \{x_1,\ldots,x_n\}$. Then $F$ is equivalent to a disjunction of $k$-counting-star 
formulas $F_C$ where $FV(F_C) = \{x_1,\ldots,x_n\}$. 


7.2 Spatial Conjunction of Stars 


Sketch of the construction. Let $F_1$ and $F_2$ be two formulas of quantifier depth 
at most one, and not containing the logical operation $\circledast$. By Proposition 12, let 
$F_1$ be equivalent to the disjunction of counting star formulas $\bigvee_{i=1}^{n_1} C_{1,i}$ and let 
$F_2$ be equivalent to the disjunction of counting star formulas $\bigvee_{j=1}^{n_2} C_{2,j}$. By 
the distributivity of $\circledast$ over $\lor$, we have 


\[ F_1 \circledast F_2 \ \sim\ \Big(\bigvee_{i=1}^{n_1} C_{1,i}\Big) \circledast \Big(\bigvee_{j=1}^{n_2} C_{2,j}\Big) \ \sim\ \bigvee_{i=1}^{n_1} \bigvee_{j=1}^{n_2} C_{1,i} \circledast C_{2,j} \]


In the sequel we show that a spatial conjunction of counting-star formulas is 
either contradictory or is equivalent to a disjunction of counting star formulas. 
This suffices to eliminate spatial conjunction of formulas of quantifier depth at 
most one. Moreover, if $F$ is any formula of quantifier depth at most one, possibly 
containing $\circledast$, by repeated elimination of the innermost $\circledast$ we obtain a formula 
without $\circledast$. 

To compute the spatial conjunction of counting stars we establish an alternative 
syntactic form for counting star formulas. The idea of this alternative 
form is roughly to replace a counting quantifier such as $\exists^{=k} x.\, F'$ with a spatial 
conjunction of $k$ formulas each of which has a meaning similar to $\exists^{=1} x.\, F'$, and 
then combine a formula $\exists^{=1} x.\, F'_1$ resulting from one counting star with a formula 
$\exists^{=1} x.\, F'_2$ resulting from another counting star into the formula $\exists^{=1} x.\, (F'_1 \oplus F'_2)$ 
where $\oplus$ denotes merging of GCCAT formulas by taking the union of their positive 
literals. We next develop this idea in greater detail. 


Notation for spatial representation of stars. Let $G_e(x_1,\ldots,x_n)$ be the
unique GCCAT formula $F$ with $FV(F) = \{x_1,\ldots,x_n\}$ such that the only positive
literals in $F$ are literals $x_i = x_i$ for $1 \leq i \leq n$. Similarly, there is a unique formula
$F' \in \mathrm{exts}(x_1,\ldots,x_n,x)$ such that every atomic formula in $F'$ distinct from
$x = x$ occurs in a negated literal. We call $F'$ an empty extension and denote it
$\mathrm{empEx}(x_1,\ldots,x_n,x)$.


To compute a spatial conjunction of formulas $C_1$ and $C_2$ in the language $L$,
we temporarily consider formulas in an extended language $L' = L \cup \{B_1, B_2\}$
where $B_1$ and $B_2$ are two new unary predicates used to mark formulas. We use
$B_1$ to mark formulas derived from $C_1$, and use $B_2$ to mark formulas derived from
$C_2$. For $m \in \{\emptyset, \{1\}, \{2\}, \{1,2\}\}$, define

\[
\begin{array}{ll}
\mathrm{Mark}_{\emptyset}(x) = \lnot B_1(x) \land \lnot B_2(x) & \mathrm{Mark}_{1}(x) = B_1(x) \land \lnot B_2(x) \\
\mathrm{Mark}_{2}(x) = \lnot B_1(x) \land B_2(x) & \mathrm{Mark}_{1,2}(x) = B_1(x) \land B_2(x)
\end{array}
\]


Note that, when we say that $F$ is a GCCAT formula, we mean that $F$ is a GCCAT
formula in language $L$ (and thus $F$ mentions symbols only from $L$), even when
we use $F$ as a subformula of a larger formula in language $L'$. Similarly, expressions
$\mathrm{exts}(x_1,\ldots,x_n,x)$, $\mathrm{empEx}(F, x)$, and $G_e(x_1,\ldots,x_n)$ all denote formulas in
language $L$.

On the other hand, $\mathrm{empEx}_{\emptyset}(F,x)$ and $\mathrm{emp}_e$ are formulas in language $L'$.
Formula $\mathrm{empEx}_{\emptyset}(F, x)$ is an empty extension of $F$ in language $L'$. Formula $\mathrm{emp}_e$
asserts that $x_1,\ldots,x_n$ have an empty GCCAT formula and that the remaining
elements have empty extension in $L'$. Formula $\mathrm{emp}_e$ does not constrain the values
$B_1(x_i)$ and $B_2(x_i)$; these values turn out to be irrelevant.

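Let $F' \in \mathrm{exts}(x_1,\ldots,x_n,x)$. Define

\[
\begin{array}{rcl}
\mathrm{empEx}_{\emptyset}(x_1,\ldots,x_n,x) &=& \mathrm{empEx}(x_1,\ldots,x_n,x) \land \mathrm{Mark}_{\emptyset}(x) \\
\mathrm{emp}_e(x_1,\ldots,x_n) &=& G_e(x_1,\ldots,x_n) \land \forall x.\, \bigl(\bigwedge_{i=1}^{n} x \neq x_i\bigr) \Rightarrow \mathrm{empEx}_{\emptyset}(x_1,\ldots,x_n,x)
\end{array}
\]

We write $\mathrm{empEx}_{\emptyset}(F, x)$ for $\mathrm{empEx}_{\emptyset}(x_1,\ldots,x_n,x)$ if $FV(F) = \{x_1,\ldots,x_n\}$, and
similarly for $\mathrm{emp}_e(F, x)$. We write simply $\mathrm{emp}_e$ if $F$ and $x$ are understood.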

We next introduce formulas $(F')^*_m$ and $(F')_m$, which are the building blocks
for representing counting star formulas. Formula $(F')^*_m$ means that $F'$ marked
with $m$ and $\mathrm{empEx}_{\emptyset}(F, x)$ are the only extensions of $F$ that hold in the
neighborhood of $x_1,\ldots,x_n$ ($F'$ may hold for any number of neighbors). Formula
$(F')_m$ means that $F'$ holds for exactly one element in the neighborhood of
$x_1,\ldots,x_n$, and all other neighbors have empty extensions. More precisely, let
$F' \in \mathrm{exts}(x_1,\ldots,x_n, x)$. Define

\[
\begin{array}{rcl}
(F')^*_m &=& G_e(x_1,\ldots,x_n) \land \forall x.\, \bigl(\bigwedge_{i=1}^{n} x \neq x_i\bigr) \Rightarrow (F' \land \mathrm{Mark}_m(x)) \lor \mathrm{empEx}_{\emptyset}(F, x) \\
(F')_m &=& (F')^*_m \land \exists^{=1} x.\, \bigwedge_{i=1}^{n} x \neq x_i \land F' \land \mathrm{Mark}_m(x)
\end{array}
\]

where $m \in \{\emptyset, \{1\}, \{2\}, \{1,2\}\}$. Observe that $G \circledast \mathrm{emp}_e \sim G$ if $G = (F')^*_m$ or
$G = (F')_m$ for some $F'$ and $m$. Also note that $(F')^*_m \circledast (F')^*_m \sim (F')^*_m$.
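$E \land F$ : EQCAT formula
$F$ : GCCAT formula

\[
\begin{array}{rcl}
S_m[E \land F \land \exists^{s_1} x.\,F'_1 \land \ldots \land \exists^{s_k} x.\,F'_k] &=& E \land K[F] \circledast X_m[\exists^{s_1} x.\,F'_1] \circledast \ldots \circledast X_m[\exists^{s_k} x.\,F'_k] \\[4pt]
K[F] &=& F \land \bigl(\forall x.\, (\bigwedge_{i=1}^{n} x \neq x_i) \Rightarrow \mathrm{empEx}_{\emptyset}(F, x)\bigr) \\[4pt]
X_m[\exists^{=0} x.\,F'] &=& \mathrm{emp}_e \\
X_m[\exists^{=i+1} x.\,F'] &=& (F')_m \circledast X_m[\exists^{=i} x.\,F'] \\
X_m[\exists^{\geq i} x.\,F'] &=& X_m[\exists^{=i} x.\,F'] \circledast (F')^*_m
\end{array}
\]

Fig. 6. Translation of Counting Stars to Spatial Notation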


Translation of counting stars. Figure 6 presents the translation of counting
stars to spatial notation. The idea of the translation is to replace $\exists^{=k} x.\, F'$ with
the spatial conjunction of $k$ formulas $(F')_m \circledast \ldots \circledast (F')_m$ where $m \in \{\{1\}, \{2\}\}$.
The purpose of the marker $m$ is to ensure that the $k$ witnesses for $x$ that
are guaranteed to exist by $(F')_m \circledast \ldots \circledast (F')_m$ are distinct. The reason that the
witnesses are distinct for $m \neq \emptyset$ is that no two of them can satisfy $B_i(x)$ at the
same time for $i \in m$.
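The unfolding of a single counting quantifier in Figure 6 can be illustrated with a small sketch. The string encoding below (`one(...)` for $(F')_m$, `star(...)` for $(F')^*_m$, and the empty list for the unit) and the function name `translate_count` are assumptions made for illustration only; they are not notation from the paper.

```python
from typing import List, Tuple, Union

# A count is either an int i (read "exactly i") or ("ge", i) (read "at least i").
# This encoding is hypothetical and chosen only for this sketch.
Count = Union[int, Tuple[str, int]]


def translate_count(count: Count, ext: str, mark: str) -> List[str]:
    """Return the spatial conjuncts produced for one counting quantifier.

    "one(ext,mark)"  stands for (F')_m  : exactly one marked witness.
    "star(ext,mark)" stands for (F')*_m : any number of marked witnesses.
    The empty list stands for the unit of spatial conjunction.
    """
    if isinstance(count, int):
        if count < 0:
            raise ValueError("count must be non-negative")
        # exactly i  ->  i copies of (F')_m joined by spatial conjunction
        return [f"one({ext},{mark})"] * count
    kind, i = count
    if kind != "ge":
        raise ValueError("unknown count kind")
    # at least i  ->  translation of exactly i, followed by one (F')*_m
    return translate_count(i, ext, mark) + [f"star({ext},{mark})"]
```

For instance, an "exactly two" quantifier yields two `one` conjuncts, while "at least one" yields one `one` conjunct followed by a `star` conjunct.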


To show the correctness of the translation in Figure 6, define $e^m$ to be the
$L'$-environment obtained by extending $L$-environment $e$ according to marking
$m$, and $\overline{e_1}$ to be the restriction of an $L'$-environment $e_1$ to language $L$. More
precisely, if $e$ is an environment in language $L$, for $m \in \{\emptyset, \{1\}, \{2\}, \{1,2\}\}$,
define environment $e^m$ in language $L'$ by 1) $e^m r = e\,r$ for $r \in L$ and 2) for
$q \in \{1,2\}$, let $(e^m B_q)\,d = \mathrm{True} \iff q \in m \land d \notin \{e\,x_1,\ldots,e\,x_n\}$. Conversely,
if $e_1$ is an environment in language $L'$, define environment $\overline{e_1}$ in language $L$ by
$\overline{e_1}\, r = e_1 r$ for all $r \in L$. Lemma 13 below gives the correctness criterion for
the translation in Figure 6.


Lemma 13. If $e$ is an environment for language $L$, $C$ a counting star formula
in language $L$, and $m \in \{\{1\}, \{2\}, \{1,2\}\}$, then $[\![C]\!]e = [\![S_m[C]]\!]e^m$.


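\[
\begin{array}{lrcl}
(1) & (T_1)_1 \circledast (T_2)_2 &\leadsto& (T_1 \odot T_2)_{1,2} \\
(2) & (T_1)_1 \circledast (T_2)^*_2 &\leadsto& (T_1 \odot T_2)_{1,2} \circledast (T_2)^*_2 \\
(3) & (T_1)^*_1 \circledast (T_2)_2 &\leadsto& (T_1)^*_1 \circledast (T_1 \odot T_2)_{1,2} \\
(4) & (T_1)^*_1 \circledast (T_2)^*_2 &\leadsto& (T_1)^*_1 \circledast (T_2)^*_2 \circledast (T_1 \odot T_2)^*_{1,2} \\
(5) & (T)^*_1 &\leadsto& \mathrm{emp}_e \\
(6) & (T)^*_2 &\leadsto& \mathrm{emp}_e
\end{array}
\]

Fig. 7. Transformation Rules for Combining Spatial Conjuncts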
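Combining quantifier-free formulas. Let $C_1 \circledast C_2$ be a spatial conjunction
of two counting-star formulas

\[
\begin{array}{rcl}
C_1 &\equiv& E \land F_1 \land \exists^{s_{1,1}} x.\,F'_1 \land \ldots \land \exists^{s_{1,k}} x.\,F'_k \\
C_2 &\equiv& E \land F_2 \land \exists^{s_{2,1}} x.\,F'_1 \land \ldots \land \exists^{s_{2,k}} x.\,F'_k
\end{array}
\]

where $F_1$ and $F_2$ are GCCAT formulas with $FV(F_1) = FV(F_2) = \{x_1,\ldots,x_n\}$,
$E \land F_1$ and $E \land F_2$ are EQCAT formulas, and $E \equiv \bigwedge_{j=1}^{m} y_j = x_{i_j}$.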

Note that we assume that the two GCCAT formulas $F_1$ and $F_2$ have the same
free variables and that the equalities $E$ in the two EQCAT formulas are the
same. This assumption is justified because either 1) $C_1$ and $C_2$ make inconsistent
assumptions about equalities among $x_1,\ldots,x_n$, and therefore $C_1 \circledast C_2$ is
equivalent to False, or 2) $C_1$ and $C_2$ make the same assumptions about equalities among
$x_1,\ldots,x_n$, so we can rewrite $C_1$ and $C_2$ to satisfy our assumption by
exchanging variables $x_i$ and $y_j$ in the definition of an EQCAT formula.

To show how to transform formula $S_1[C_1] \circledast S_2[C_2]$ into a disjunction of
formulas of the form $S_{1,2}[C_3]$, we introduce the following notation. If $T$ is a
formula, let $S(T)$ denote the set of positive literals in $T$ that do not contain
equality. Let $T_1 \in \mathrm{exts}(F_1,x)$ and $T_2 \in \mathrm{exts}(F_2,x)$. (Note that $\mathrm{exts}(F_1,x) =
\mathrm{exts}(F_2,x)$.) We define the partial operation $T_1 \odot T_2$ as follows. The result of
$T_1 \odot T_2$ is defined iff $S(T_1) \cap S(T_2) = \emptyset$. If $S(T_1) \cap S(T_2) = \emptyset$, then $T_1 \odot T_2 = T$
where $T$ is the unique element of $\mathrm{exts}(F_1,x)$ such that $S(T) = S(T_1) \cup S(T_2)$.
Similarly to $\odot$, we define the partial operation $F_1 \oplus F_2$ for $F_1$ and $F_2$ GCCAT
formulas with $FV(F_1) = FV(F_2) = \{x_1,\ldots,x_n\}$. The result of $F_1 \oplus F_2$ is defined
iff $S(F_1) \cap S(F_2) = \emptyset$. If $S(F_1) \cap S(F_2) = \emptyset$, then $F_1 \oplus F_2$ is the unique GCCAT
formula $F$ such that $FV(F) = \{x_1,\ldots,x_n\}$ and $S(F) = S(F_1) \cup S(F_2)$. The
following Lemma 14 notes that $\odot$ and $\oplus$ are sound rules for computing spatial
conjunction of certain quantifier-free formulas.


Lemma 14. If $T_1, T_2 \in \mathrm{exts}(x_1,\ldots,x_n,x)$ then $T_1 \circledast T_2 \sim T_1 \odot T_2$. If $F_1$ and
$F_2$ are GCCAT formulas with $FV(F_1) = FV(F_2) = \{x_1,\ldots,x_n\}$, then $F_1 \circledast F_2 \sim
F_1 \oplus F_2$.
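The partial merge operation can be sketched as follows. The sketch assumes that an extension $T$ is represented solely by its set $S(T)$ of positive non-equality literals (which determines $T$ uniquely within a fixed set of extensions); the literal strings and the name `merge` are illustrative choices, not the paper's notation.

```python
from typing import FrozenSet, Optional

# An extension (or GCCAT formula) is represented by its set S(T) of
# positive literals that do not contain equality.
Literals = FrozenSet[str]


def merge(s1: Literals, s2: Literals) -> Optional[Literals]:
    """Partial merge: defined only when the positive literals are disjoint.

    Returns the union of the two literal sets, or None when the sets
    overlap, in which case the merge is undefined.
    """
    if s1 & s2:
        return None
    return s1 | s2
```

Disjointness mirrors the fact that spatial conjunction splits each relation into disjoint parts, so the same positive fact cannot be contributed by both conjuncts.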


Rules for transforming spatial conjuncts. We transform formula
$S_1[C_1] \circledast S_2[C_2]$ into a disjunction of formulas of the form $S_{1,2}[C_3]$ as follows.

The first step in transforming $C_1 \circledast C_2$ is to replace $K[F_1] \circledast K[F_2]$ with
$K[F_1 \oplus F_2]$ if $F_1 \oplus F_2$ is defined, or False if $F_1 \oplus F_2$ is not defined.

The second step is summarized in Figure 7, which presents rules for combining
conjuncts resulting from $X_1[\exists^{s_1} x.\,F'_1]$ and $X_2[\exists^{s_2} x.\,F'_2]$ into conjuncts of
the form $X_{1,2}[\exists^{s} x.\,F']$. The intuition is that $(T)^*_m$ and $(T)_m$ represent a finite
abstraction of all possible neighborhoods of $x_1,\ldots,x_n$, and the rules in Figure 7
represent the ways in which different portions of the neighborhoods combine using
spatial conjunction. We apply the rules in Figure 7 modulo commutativity
and associativity of $\circledast$, the fact that $\mathrm{emp}_e$ is a unit for $\circledast$, as well as the
idempotence of $(T)^*_m$. Rules (1)-(4) are applicable only when the occurrence of $T_1 \odot T_2$
on the right-hand side of the rule is defined. We apply rules (1)-(4) as long as
possible, and then apply rules (5), (6). Moreover, we only allow the sequences of
rule applications that eliminate all occurrences of $(T)_1$, $(T)^*_1$, $(T)_2$, $(T)^*_2$, leaving
only $(T)_{1,2}$ and $(T)^*_{1,2}$. Note also that there are only finitely many non-equivalent
expressions that can be obtained by sequences of applications of rules in Figure 7.
Namely, an application of rules (1)-(3) decreases the total number of
spatial conjuncts of the form $(T)_1$ and $(T)_2$, multiple applications of rule (4) to
the same pair of spatial conjuncts are unnecessary because of the idempotence
of $(T_1 \odot T_2)^*_{1,2}$ (so we never perform them), and rules (5), (6) reduce the total
number of spatial conjuncts. The following Lemma 15 gives partial correctness
of rules in Figure 7.
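A single application of rules (1)-(4) to one conjunct marked 1 and one conjunct marked 2, together with rules (5), (6), can be sketched as below. The `Conj` record, the mark strings `"1"`, `"2"`, `"12"`, and the function names are assumptions made for this illustration; the sketch covers only one rewriting step and does not enumerate the rule sequences that the text describes.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class Conj(NamedTuple):
    lits: FrozenSet[str]  # S(T): positive non-equality literals of T
    mark: str             # "1", "2", or "12" (hypothetical encoding of m)
    star: bool            # True for (T)*_m, False for (T)_m


def combine(a: Conj, b: Conj) -> Optional[List[Conj]]:
    """Apply the applicable rule among (1)-(4) to a (mark 1) and b (mark 2).

    Returns the conjuncts on the right-hand side of the rule, or None when
    the merge of the two extensions is undefined.
    """
    if a.mark != "1" or b.mark != "2":
        raise ValueError("expected one conjunct marked 1 and one marked 2")
    if a.lits & b.lits:
        return None  # merge undefined, so no rule applies
    merged = a.lits | b.lits
    if not a.star and not b.star:          # rule (1)
        return [Conj(merged, "12", False)]
    if not a.star and b.star:              # rule (2)
        return [Conj(merged, "12", False), b]
    if a.star and not b.star:              # rule (3)
        return [a, Conj(merged, "12", False)]
    return [a, b, Conj(merged, "12", True)]  # rule (4)


def drop_leftover_stars(conjs: List[Conj]) -> List[Conj]:
    """Rules (5), (6): replace remaining (T)*_1 and (T)*_2 by the unit."""
    return [c for c in conjs if not (c.star and c.mark in ("1", "2"))]
```

Representing the unit by simply omitting a conjunct reflects the fact that the rules are applied modulo the unit law for spatial conjunction.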


Lemma 15. If $G_1 \leadsto G_2$, then $G_2 \Rightarrow G_1$ is valid.


Define $G_1 \Longrightarrow G_2$ to hold iff both of the following two conditions hold: 1)
$G_2$ results from $G_1$ by replacing $K[F_1] \circledast K[F_2]$ with $K[F_1 \oplus F_2]$ if $F_1 \oplus F_2$ is
defined, or False if $F_1 \oplus F_2$ is not defined, and then applying some sequence of
rules in Figure 7 such that rules (5), (6) are applied only when rules (1)-(4)
are not applicable; 2) $G_2$ contains only spatial conjuncts of the form $(T)_{1,2}$ and
$(T)^*_{1,2}$. From Lemma 15 and Lemma 14 we immediately obtain Lemma 16.


Lemma 16. If $G_1 \Longrightarrow G_2$, then $G_2 \Rightarrow G_1$ is valid.


The rule for computing the spatial conjunction of counting star formulas is the
following. If $C_1$, $C_2$, and $C_3$ are counting star formulas, define $R(C_1, C_2, C_3)$ to
hold iff $S_1[C_1] \circledast S_2[C_2] \Longrightarrow S_{1,2}[C_3]$. We compute spatial conjunction by replacing
$C_1 \circledast C_2$ with $\bigvee_{R(C_1,C_2,C_3)} C_3$. Our goal is therefore to show the equivalence

\[
  C_1 \circledast C_2 \;\sim\; \bigvee_{R(C_1,C_2,C_3)} C_3 \qquad (3)
\]


The validity of $\bigvee_{R(C_1,C_2,C_3)} C_3 \Rightarrow (C_1 \circledast C_2)$ follows from Lemma 16 and
Lemma 13.


Lemma 17. $(\bigvee_{R(C_1,C_2,C_3)} C_3) \Rightarrow (C_1 \circledast C_2)$ is a valid formula for every pair of
counting star formulas $C_1$ and $C_2$.


We next consider the converse claim. If $[\![C_1 \circledast C_2]\!]e$, then there are $e_1$ and $e_2$ such
that $\mathrm{split}\ e\ [e_1, e_2]$, $[\![C_1]\!]e_1$, and $[\![C_2]\!]e_2$. By considering the atomic types induced
in $e$, $e_1$ and $e_2$ by elements in $D \setminus \{e\,x_1,\ldots,e\,x_n\}$, we construct a sequence
of $\leadsto$ transformations in Figure 7 that convert $S_1[C_1] \circledast S_2[C_2]$ into a formula
$S_{1,2}[C_3]$ such that $[\![C_3]\!]e = \mathrm{True}$.


Lemma 18. $C_1 \circledast C_2 \Rightarrow \bigvee_{R(C_1,C_2,C_3)} C_3$ is a valid formula for every pair of
counting star formulas $C_1$ and $C_2$.


From Lemma 17 and Lemma 18 we obtain the desired Theorem 19, which 
shows the correctness of our rules for computing spatial conjunction of formulas 
of quantifier depth at most one. 


Theorem 19. The equivalence (3) holds for every pair of counting star formulas
$C_1$ and $C_2$.
8 Further Remarks


In this section we present two additional remarks regarding spatial conjunction. 
The first remark notes that we must be careful when extracting a subformula 
from a formula and labelling it with a new predicate. The second remark shows 
how to encode spatial conjunction in second-order logic, thus providing some 
insight into the expressive power of spatial conjunction. 


8.1 Extracting Subformulas in the Presence of $\circledast$


In two-variable logic with counting $C^2$ we may efficiently transform a formula into
an unnested form by introducing new predicate names and naming subformulas
using these predicates. This transformation is a standard step in decidability
proofs for two-variable logic with counting [22, 45].

The satisfiability of the resulting formula is equivalent to the satisfiability of 
the original formula. An extraction of a subformula G and its replacement with 
a new predicate P can be justified by a substitution lemma of the form: 


\[
  [\![F[P := G]]\!]e = [\![F]\!](e[P := [\![G]\!]e])
\]


where e is the environment (model). This substitution lemma does not hold in 
the presence of spatial conjunction that splits the values of newly introduced 
predicates. Namely, 


\[
  [\![(F_1 \circledast F_2)[P := G]]\!]e \;\Rightarrow\; [\![F_1 \circledast F_2]\!](e[P := [\![G]\!]e])
\]


holds, but the converse implication does not hold because the value [G]e of the 
relation P might be split on the right-hand side. 

It is therefore interesting to divide predicates into splittable and non-splittable 
predicates, and have spatial conjunction split only the interpretations of split- 
table predicates. The substitution lemma then holds when P is a non-splittable 
predicate. 

Note, however, that in the presence of non-splittable predicates we cannot 
translate counting stars into spatial notation and thus use unnested form to 
eliminate all spatial conjunctions from first-order formulas. As a result, adding 
spatial conjunction of formulas of large quantifier depth to two-variable logic 
with counting may increase the expressive power of the resulting logic. 

We also remark that if the language contains only one splittable unary predi- 
cate Ag, then it is easy to simulate the splitting of objects of the universe, which 
is the semantics of spatial conjunction in [28]. Namely, we use some fixed unary 
predicate Ag to denote all “live” objects, and make all quantifiers range only 
over the objects that satisfy Ag. 


8.2 Representing $\circledast$ in Second-Order Logic


In this section we give a simple translation from the first-order logic with spatial 
conjunction and inductive definitions [27, Chapter 4] to second-order logic. This 
gives an upper bound on the expressive power of first-order logic with spatial
conjunction and inductive definitions. 


Consider first-order logic extended with the spatial conjunction $\circledast$ and the
least-fixpoint operator. The syntax of the least-fixpoint operator is

\[
  (\mathrm{lfp}\ P, x_1,\ldots,x_n.\, F)(y_1,\ldots,y_n)
\]

where $F$ is a formula that may contain new free variables $P, x_1,\ldots,x_n$. The
meaning of the least-fixpoint operator is that the relation which is the least
fixpoint of the monotonic transformation on predicates

\[
  (\lambda x_1,\ldots,x_n.\, P(x_1,\ldots,x_n)) \;\mapsto\; (\lambda x_1,\ldots,x_n.\, F)
\]

holds for $y_1,\ldots,y_n$. To ensure the monotonicity of the transformation on
predicates, we require that $P$ occurs only positively in $F$.


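\[
\begin{array}{rcl}
\mathcal{A} &=& \{A_1,\ldots,A_n\} \\
\mathcal{F} &=& \{f_1,\ldots,f_m\} \\[4pt]
[\![F' \circledast F'']\!] &=& \exists A'_1,\ldots,A'_n, f'_1,\ldots,f'_m,\ A''_1,\ldots,A''_n, f''_1,\ldots,f''_m.\ \beta[F' \circledast F''] \\[4pt]
\beta[F' \circledast F''] &=& \bigwedge_{i=1}^{n} (\mathrm{split}_1\ A_i\ A'_i\ A''_i) \land \bigwedge_{i=1}^{m} (\mathrm{split}_2\ f_i\ f'_i\ f''_i) \ \land \\
 && [\![F']\!][A_i := A'_i]_{i=1}^{n}[f_i := f'_i]_{i=1}^{m} \land [\![F'']\!][A_i := A''_i]_{i=1}^{n}[f_i := f''_i]_{i=1}^{m} \\[4pt]
\mathrm{split}_1\ A\ A'\ A'' &=& \forall x.\ (A(x) \Leftrightarrow (A'(x) \lor A''(x))) \land \lnot(A'(x) \land A''(x)) \\
\mathrm{split}_2\ f\ f'\ f'' &=& \forall x, y.\ (f(x,y) \Leftrightarrow (f'(x,y) \lor f''(x,y))) \land \lnot(f'(x,y) \land f''(x,y)) \\[4pt]
[\![(\mathrm{lfp}\ P, x_1,\ldots,x_n.\, F)(y_1,\ldots,y_n)]\!] &=& \forall P.\ (\forall x_1,\ldots,x_n.\ (F \Leftrightarrow P(x_1,\ldots,x_n))) \Rightarrow P(y_1,\ldots,y_n)
\end{array}
\]

Fig. 8. Translation of Spatial Conjunction and Inductive Definitions into Second-Order
Logic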


Figure 8 presents the translation from first-order logic extended with spatial
conjunction and least-fixpoint operator to second-order logic. The translation
directly mimics the semantics of $\circledast$ and lfp.

In second-order logic, the relations in $L = \mathcal{A} \cup \mathcal{F}$ become free variables.
To translate $\circledast$, use second-order quantification to assert the existence of
new unary and binary relations that partition the relations in $L$ into relations
in $L'$ and $L''$. Then perform a syntactic replacement of relations in $L$ with the
corresponding relations in $L'$ for the first formula, and with the corresponding
relations in $L''$ for the second formula.
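On a finite structure, the existential quantification over partitions can be mimicked by brute-force enumeration. The following is a minimal sketch under the assumption that an environment is a dictionary from relation names to finite sets of tuples and that a formula is given as a Python predicate on such environments; the names `splits` and `spatial_conj` are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, FrozenSet, Iterator, Tuple

Env = Dict[str, FrozenSet[Tuple[int, ...]]]
Formula = Callable[[Env], bool]


def splits(env: Env) -> Iterator[Tuple[Env, Env]]:
    """Enumerate all ways to partition every relation of env into two parts."""
    facts = [(name, t) for name in sorted(env) for t in sorted(env[name])]
    for choice in product((0, 1), repeat=len(facts)):
        left = {name: set() for name in env}
        right = {name: set() for name in env}
        for (name, t), side in zip(facts, choice):
            (left if side == 0 else right)[name].add(t)
        yield (
            {n: frozenset(s) for n, s in left.items()},
            {n: frozenset(s) for n, s in right.items()},
        )


def spatial_conj(p: Formula, q: Formula) -> Formula:
    """p (*) q holds in env iff some split satisfies p on one part, q on the other."""
    return lambda env: any(p(l) and q(r) for l, r in splits(env))
```

For example, with a predicate stating that relation `f` contains exactly one tuple, the spatial conjunction of that predicate with itself holds exactly when `f` contains two tuples, whereas ordinary conjunction would be equivalent to the predicate itself. The enumeration is exponential in the number of tuples and is meant only to illustrate the semantics.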

Translating lfp is also straightforward. The property that $P$ is a fixpoint of
$F$ is easily expressible. To encode that $y_1,\ldots,y_n$ hold for the least fixpoint of $F$,
we state that $y_1,\ldots,y_n$ hold for all fixpoints of $F$, using universal second-order
quantification over $P$.
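For a monotone operator on subsets of a finite domain, membership in every fixpoint coincides with membership in the least fixpoint. The sketch below checks this on a small reachability example; the domain, the edge set, and all function names are assumptions introduced for illustration.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterator, Set

Pred = FrozenSet[int]

DOMAIN: Set[int] = {0, 1, 2, 3}
EDGES = {(0, 1), (1, 2), (3, 3)}  # hypothetical example graph


def step(p: Pred) -> Pred:
    """Monotone operator: node 0 together with all successors of nodes in p."""
    return frozenset({0}) | frozenset(y for (x, y) in EDGES if x in p)


def all_subsets(domain: Set[int]) -> Iterator[Pred]:
    items = sorted(domain)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)


def holds_in_all_fixpoints(f: Callable[[Pred], Pred], domain: Set[int], y: int) -> bool:
    """Second-order reading: y belongs to every P with f(P) = P."""
    return all(y in s for s in all_subsets(domain) if f(s) == s)


def least_fixpoint(f: Callable[[Pred], Pred]) -> Pred:
    """Iterate f from the empty set until it stabilizes (finite domain, f monotone)."""
    cur: Pred = frozenset()
    while True:
        nxt = f(cur)
        if nxt == cur:
            return cur
        cur = nxt
```

In this example node 3 belongs to one fixpoint (because of its self-loop) but not to all of them, so the universally quantified formulation correctly excludes it.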

We also note that the translation of $\circledast$ in Figure 8 uses only existential
second-order quantification, which points to another class of formulas where
spatial conjunction can be eliminated if we are only concerned with satisfiability.
Namely, if $F'$ and $F''$ are first-order formulas (without $\circledast$ or lfp), then $F' \circledast F''$
is satisfiable iff the first-order formula $\beta[F' \circledast F'']$ in the extended language is
satisfiable. As a slight generalization, define the following class of “interesting”
formulas:


1. a first-order formula $F$ is an interesting formula;
2. if $F_1$ and $F_2$ are interesting formulas, so is $F_1 \circledast F_2$;
3. if $F_1$ and $F_2$ are interesting formulas, so is $F_1 \lor F_2$.


The satisfiability of each interesting formula is equivalent to the satisfiability of
the corresponding first-order formula in an extended vocabulary. In particular,
the satisfiability of the class of formulas formed starting from formulas in
two-variable logic with counting and applying only $\lor$ and $\circledast$ is decidable.


9 Further Related Work 


Records have been studied in the context of functional and object-oriented
programming languages [11, 14, 23, 29, 42, 46-48, 57]. The main difference between
existing record notations and our system is that the interpretation of a record in 
our system is a predicate on an object, where an object is linked to other objects 
forming a graph, as opposed to being a type that denotes a value (with values 
typically representable as finite trees). Our view is appropriate for programming 
languages such as Java and ML that can manipulate structures using destruc- 
tive updates. Our generalizations allow the developers to express both incoming 
and outgoing references of objects, and allow the developers to express typestate 
changes. 

We have developed role logic to provide a foundation for role analysis [30-33]. 
We have subsequently studied a simplification of role analysis constraints and 
showed a characterization of such constraints using formulas [34,35]. Multifields 
and multislots are present already in [32, Section 8.1]. In this section we have 
shown that role logic provides a unifying framework for all these constraints 
and goes beyond them in 1) being closed under the fundamental boolean logical 
operations, and, 2) being closed under spatial conjunction for an interesting class 
of formulas. The view of roles as predicates is equivalent to the view of roles as
sets and works well in the presence of data abstraction [39, 40].

The parametric analysis based on three-valued logic was introduced in [53,
54]. Other approaches to verifying shape invariants include [13, 19-21, 26, 41]. A
decidable logic for expressing connectivity properties of the heap was presented
in [4]. We use spatial conjunction from separation logic that has been used for
reasoning about the heap [7, 8, 28, 51, 52]. Description logics [1, 6] share many
of the properties of role logic and have been traditionally applied to knowledge
bases. [9, 10] present doubly-exponential deterministic algorithms for reasoning
about the satisfiability of expressive description logics over all structures and
over finite structures. The decidability of two-variable logic with counting $C^2$
was shown in [22], whereas [45] establishes the NEXPTIME-complexity of the
satisfiability problem for the fragment of $C^2$ with counting up to one.


10 Conclusions 


We have shown how to add notation for records to two-variable role logic while 
preserving its decidability. The resulting notation supports a generalization of 
traditional records with record specifications that are closed under all boolean 
operations as well as record concatenation, allow the description of typestate 
properties, support inverse records, and capture the distinction between open 
and closed records. We believe that such an expressive and decidable notation is 
useful as an annotation language used with program analyses and type systems. 


Acknowledgements. We thank the participants of the Dagstuhl Seminar 
03101 “Reasoning about Shape” for useful discussions on separation logic and 
shape analysis. 
A Appendix: Correctness of Spatial Conjunction
Elimination


Proposition 8. Every quantifier-free formula $F$ such that $FV(F) \subseteq \{x_1,\ldots,x_n\}$
is equivalent to a disjunction of CAT formulas $C$ such that $FV(C) = \{x_1,\ldots,x_n\}$.


Proof. Let $F$ be a quantifier-free formula and $FV(F) \subseteq \{x_1,\ldots,x_n\}$. Transform
$F$ to disjunctive normal form $F'$. Let $C$ be a conjunction in $F'$. If $C$ contains
a literal and its negation, then $C$ is contradictory and we eliminate $C$ from $F'$.
Assume all conjunctions are non-contradictory, and let $C$ be one conjunction. If
there exists an atomic formula $F_A$ in variables $\{x_1,\ldots,x_n\}$ such that $F_A \notin C$
and $(\lnot F_A) \notin C$, then replace $C$ with the disjunction

\[
  (C \land F_A) \lor (C \land \lnot F_A)
\]

By repeating this process, we obtain a disjunction of CAT formulas.
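The case split in this proof can be sketched as a small recursive procedure. The sketch assumes that a non-contradictory conjunction is represented as a dictionary from atomic-formula strings to truth values and that the finite list of all atomic formulas over the variables is given; the name `complete` and the atom strings are hypothetical.

```python
from typing import Dict, List


def complete(partial: Dict[str, bool], atoms: List[str]) -> List[Dict[str, bool]]:
    """Expand a partial conjunction into complete ones.

    partial maps each decided atomic formula to True (positive literal) or
    False (negated literal). Every atom from atoms that is still undecided
    is split into its positive and its negative case, as in the proof.
    """
    missing = [a for a in atoms if a not in partial]
    if not missing:
        return [dict(partial)]
    a = missing[0]
    return (
        complete({**partial, a: True}, atoms)
        + complete({**partial, a: False}, atoms)
    )
```

Each undecided atom doubles the number of resulting conjunctions, so the expansion is exponential in the number of missing atoms.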


Lemma 9. Every CAT formula $F$ is either contradictory, or is equivalent to an
EQCAT formula $F'$ such that $FV(F') = FV(F)$.


Proof. Let $F$ be a CAT formula. If $x_i \neq x_i$ occurs in $F$, then $F$ is contradictory.
If $x_i = x_j$ occurs in $F$ for $i \neq j$, then in all conjuncts other than $x_i = x_j$
replace all occurrences of $x_i$ with $x_j$. Repeat this process as long as it is possible.
Suppose that the resulting formula was not established to be contradictory. Let
$y_1,\ldots,y_m$ be variables that occur only on the left-hand side of some equality
$y_j = x_{i_j}$. Removing all equalities of the form $y_j = y_j$ yields an EQCAT formula.


Proposition 10. Every quantifier-free formula $F$ such that $FV(F) \subseteq
\{x_1,\ldots,x_n\}$ can be written as a disjunction of EQCAT formulas $C$ such that
$FV(C) = \{x_1,\ldots,x_n\}$.


Proof. Let $F$ be a quantifier-free formula such that $FV(F) \subseteq \{x_1,\ldots,x_n\}$. Using
Proposition 8, transform $F$ to a disjunction of CAT formulas $F_1$. Then, for each
disjunct $C$ of $F_1$, apply Lemma 9 to transform $C$ into an EQCAT formula, dropping $C$
if it is contradictory.


Proposition 12. Let $F$ be a formula such that $F$ has quantifier depth
at most one, $F$ has counting degree at most $k$, and $FV(F) \subseteq \{x_1,\ldots,x_n\}$.
Then $F$ is equivalent to a disjunction of k-counting-star formulas $F_C$ where
$FV(F_C) = \{x_1,\ldots,x_n\}$.


Proof. Let $F$ be a formula such that $F$ has quantifier depth at most one, $F$
has counting degree at most $k$, and $FV(F) \subseteq \{x_1,\ldots,x_n\}$. Then $F$ is a boolean
combination of 1) atomic formulas and 2) formulas of the form $\exists^{s} z.\, F'$ where $F'$
is quantifier-free and $FV(F') = \{z, x_1,\ldots,x_n\}$. Because $z$ is a bound variable,
rename it to $x$ in each formula $F'$. Let $F_1$ be the result of transforming this
boolean combination to disjunctive normal form. Consider a disjunct $C$ of $F_1$.
As in the proof of Proposition 10, and treating quantified formulas as atomic
syntactic entities, transform $C$ into a disjunction of formulas of the form

\[
  \bigwedge_{j=1}^{m} y_j = w_{i_j} \;\land\; F \;\land\; \bigwedge_{F' \in S} \bigl(\exists^{\beta(F')} x.\, F'\bigr)^{\alpha(F')}
\]

where $\beta(F') \in C_{k+1}$, $\alpha(F') \in \{0,1\}$ for $F' \in S$, and where $\bigwedge_{j=1}^{m} y_j = w_{i_j} \land F$
is an EQCAT formula with $y_1,\ldots,y_m, w_1,\ldots,w_p$ distinct variables such that
$\{y_1,\ldots,y_m, w_1,\ldots,w_p\} = \{x_1,\ldots,x_n\}$, and $FV(F') \subseteq \{x, x_1,\ldots,x_n\}$ for $F' \in S$.
Here $S$ is the set of formulas $F'$ whose quantified forms $\exists^{\beta(F')} x.\, F'$ end up conjoined
with the EQCAT formula as the result of the transformation to normal form. By
replacing each $y_j$ with $w_{i_j}$ in each $F'$, enforce that $FV(F') \subseteq \{x, w_1,\ldots,w_p\}$.
Using Proposition 10, transform each $F'$ to a disjunction of EQCAT formulas.
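By applying the equivalences

\[
\begin{array}{rcl}
\exists^{\geq k} x.\, \bigvee_{i=1}^{q} B_i &\sim& \bigvee_{\sum_{j=1}^{q} l_j = k}\ \bigwedge_{i=1}^{q} \exists^{\geq l_i} x.\, B_i \\[6pt]
\exists^{= k} x.\, \bigvee_{i=1}^{q} B_i &\sim& \bigvee_{\sum_{j=1}^{q} l_j = k}\ \bigwedge_{i=1}^{q} \exists^{= l_i} x.\, B_i
\end{array}
\]

for $B_1,\ldots,B_q$ mutually exclusive, and propagating the disjunction to the top
level, ensure that every $F'$ is an EQCAT formula. Then transform each term
$(\exists^{\beta(F')} x.\, F')^{\alpha(F')}$ into a positive boolean combination of formulas of one of the
forms $\exists^{=i} x.\, F'$ for $0 \leq i \leq k$ and $\exists^{\geq k+1} x.\, F'$, using the properties

\[
\begin{array}{rcl}
\lnot \exists^{\geq k_1} x.\, F' &\sim& \bigvee_{i=0}^{k_1 - 1} \exists^{=i} x.\, F' \\[6pt]
\lnot \exists^{= k_1} x.\, F' &\sim& \bigvee_{i \in \{0,\ldots,k\} \setminus \{k_1\}} \exists^{=i} x.\, F' \;\lor\; \exists^{\geq k+1} x.\, F'
\end{array}
\]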

Next ensure that each $F'$ is not merely an EQCAT, but in fact a GCCAT such
that $F' \in \mathrm{exts}(F, x)$, as follows.

Suppose that $F'$ contains a literal $L_1$ complementary to some literal occurring
in GCCAT formula $F$. If $L_1$ occurs in $\exists^{=i} x.\, F'$ for $i > 0$ or in $\exists^{\geq k+1} x.\, F'$, then
the entire conjunct is contradictory and we eliminate it. If $L_1$ occurs in $\exists^{=0} x.\, F'$,
then $\exists^{=0} x.\, F'$ is implied by $F$, so eliminate it. Assume that $F'$ has no literals
complementary to literals in $F$. Then $F'$ contains $w_i \neq w_j$ for all $i \neq j$. Next
ensure that $x \neq w_i$ is a conjunct for $1 \leq i \leq p$, as follows. Suppose that $F'$
contains the conjunct $x = w_i$ for some $1 \leq i \leq p$.

There is clearly at most one interpretation of $x$ that is equal to the interpretation
of $w_i$, so if $\beta(F') \in \{2, 3, \ldots, k, (k+1)^+\}$ then $F$ and $F'$ are contradictory and
the entire conjunction is False, so assume $\beta(F') \in \{0,1\}$. For the same reason,
$\exists^{=1} x.\, F'$ is equivalent to $\exists x.\, F'$, so if $\beta(F') = 1$, then replace $x$ with $w_i$ in
$F'$, giving a GCCAT formula $F''$ such that $FV(F'') = FV(F)$. By definition of
GCCAT formulas, either $F$ and $F''$ are equivalent, so $F \land (\exists x.\, F') \sim F$, or $F$
and $F''$ are contradictory, and the entire conjunction is False.

Assume therefore that $x \neq w_i$ occurs in $F'$ for all $1 \leq i \leq p$. This means that
$F'$ is a GCCAT formula. Because $FV(F') = \{x, w_1,\ldots,w_p\}$ and $F'$ does not
contain a literal complementary to a literal from $F$, eliminating from $F'$ atomic
formulas that occur in $F$ yields an element of $\mathrm{exts}(F, x)$.

To ensure that there exists exactly one conjunct of the form $\exists^{s} x.\, F'$ for each
$F' \in \mathrm{exts}(F, x)$, use the fact that the $k+1$ formulas $\exists^{=i} x.\, F'$, for $0 \leq i \leq k$, and
$\exists^{\geq k+1} x.\, F'$ form a partition (they are mutually exclusive and their disjunction
is True).


Lemma 13. If $e$ is an environment for language $L$, $C$ a counting star formula
in language $L$, and $m \in \{\{1\}, \{2\}, \{1,2\}\}$, then $[\![C]\!]e = [\![S_m[C]]\!]e^m$.


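Proof. Formula $E$ contains only equalities, so $[\![E]\!]e$ iff $[\![E]\!]e^m$. It therefore suffices
to show that

\[
  [\![K[F] \circledast X_m[\exists^{s_1} x.\,F'_1] \circledast \ldots \circledast X_m[\exists^{s_k} x.\,F'_k]]\!]e^m = \mathrm{True} \qquad (4)
\]

iff $[\![F]\!]e = \mathrm{True}$ and for all $i$, $[\![\exists^{s_i} x.\,F'_i]\!]e = \mathrm{True}$.

($\Rightarrow$): Let (4) hold. Then there exist $e_0, e_1,\ldots,e_k$ such that $\mathrm{split}\ e^m\ [e_0, e_1, \ldots, e_k]$,
$[\![K[F]]\!]e_0 = \mathrm{True}$, and $[\![X_m[\exists^{s_i} x.\,F'_i]]\!]e_i = \mathrm{True}$ for $1 \leq i \leq k$.

We first show $[\![F]\!]e = \mathrm{True}$. Note first that $[\![G_e]\!]e_i = \mathrm{True}$ for $1 \leq i \leq k$.
Namely, because both $(F')^*_m$ and $(F')_m$ entail $G_e$, so does $X_m[\exists^{s_i} x.\,F'_i]$, by
definition of $X_m[\cdot]$ and split. Therefore, $e_0$ is the only environment among
$e_0, e_1,\ldots,e_k$ that may have non-empty relations between the elements
interpreting $x_1,\ldots,x_n$. As a result, $[\![F]\!]e^m = [\![F]\!]e_0$. But $[\![F]\!]e_0 = \mathrm{True}$ because
$[\![K[F]]\!]e_0 = \mathrm{True}$. Therefore $[\![F]\!]e^m = \mathrm{True}$, and $F$ contains no symbols from
$L' \setminus L$, so $[\![F]\!]e = \mathrm{True}$.

We next show $[\![\exists^{s_i} x.\,F'_i]\!]e = \mathrm{True}$ for $1 \leq i \leq k$. For $s_i = p^+$, from
$[\![X_m[\exists^{s_i} x.\,F'_i]]\!]e_i = \mathrm{True}$ we have that there exist $e_{i,0}, e_{i,1},\ldots,e_{i,p}$ such that
1) $\mathrm{split}\ e_i\ [e_{i,0}, e_{i,1},\ldots,e_{i,p}]$; 2) $[\![(F'_i)^*_m]\!]e_{i,0} = \mathrm{True}$, and 3) $[\![(F'_i)_m]\!]e_{i,j} = \mathrm{True}$
for $1 \leq j \leq p$. Similarly, for $s_i < p$, we have that there exist $e_{i,1},\ldots,e_{i,s_i}$
such that 1) $\mathrm{split}\ e_i\ [e_{i,1},\ldots,e_{i,s_i}]$, and 2) $[\![(F'_i)_m]\!]e_{i,j} = \mathrm{True}$ for $1 \leq j \leq s_i$.
Note that whenever $[\![(F'_i)^*_m]\!]e_{i,j}$ or $[\![(F'_i)_m]\!]e_{i,j}$ holds, we can split elements of
the domain $D$ into two disjoint sets: elements $E_{i,j}$ for which $\mathrm{empEx}_{\emptyset}(F, x)$
holds, and elements $N_{i,j}$ for which $F'_i \land \mathrm{Mark}_m(x)$ holds. If $[\![(F'_i)_m]\!]e_{i,j}$, then
$|N_{i,j}| = 1$, by definition of $(F'_i)_m$. Moreover, by definition of split and
because $m \neq \emptyset$, we have $N_{i_1,j_1} \cap N_{i_2,j_2} = \emptyset$ for $(i_1,j_1) \neq (i_2,j_2)$. Observe
that, for a given domain element $d \in D$, the atomic type extension corresponding
to $e^m$ with $x \mapsto d$ is the union of atomic type extensions corresponding
to each $e_{i,j}$. The atomic type extension for $d$ in $e_{i,j}$ is either $F'_i \land \mathrm{Mark}_m(x)$,
or $\mathrm{empEx}_{\emptyset}(F, x)$. Therefore, the atomic type extension for $d$ in $e^m$ is either
$F'_i \land \mathrm{Mark}_m(x)$ if $d \in N_{i,j}$ for some $i, j$, or $\mathrm{empEx}_{\emptyset}(F, x)$ if for all $i, j$, $d \notin N_{i,j}$.
If $N_i = \{d \mid [\![F'_i]\!]e^m[x \mapsto d] = \mathrm{True}\}$, then $N_i = \bigcup_j N_{i,j}$. If $s_i < p$ then
$|N_i| = \sum_{j=1}^{s_i} |N_{i,j}| = \sum_{j=1}^{s_i} 1 = s_i$, so $[\![\exists^{=s_i} x.\,F'_i]\!]e^m = \mathrm{True}$. Because $\exists^{=s_i} x.\,F'_i$
is a formula in language $L$, we have $[\![\exists^{=s_i} x.\,F'_i]\!]e = \mathrm{True}$. Similarly, if $s_i = p^+$,
then $|N_i| = |N_{i,0}| + \sum_{j=1}^{p} |N_{i,j}| = |N_{i,0}| + p \geq p$, so $[\![\exists^{\geq p} x.\,F'_i]\!]e^m = \mathrm{True}$ and
therefore $[\![\exists^{\geq p} x.\,F'_i]\!]e = \mathrm{True}$. In both cases, $[\![\exists^{s_i} x.\,F'_i]\!]e = \mathrm{True}$.

This completes one direction of the implication; we next show the converse
direction.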

($\Leftarrow$): Let $[\![F]\!]e = \mathrm{True}$ and for all $i$ where $1 \leq i \leq k$, $[\![\exists^{s_i} x.\,F'_i]\!]e = \mathrm{True}$.
We construct environments $e_0, e_1,\ldots,e_k$ such that 1) $\mathrm{split}\ e^m\ [e_0, e_1,\ldots,e_k]$, 2)
$[\![K[F]]\!]e_0 = \mathrm{True}$, and 3) $[\![X_m[\exists^{s_i} x.\,F'_i]]\!]e_i = \mathrm{True}$ for all $i$ where $1 \leq i \leq k$.
We construct $e_0, e_1,\ldots,e_k$ by assigning the tuples of relations in $e^m$ to one of
the environments $e_0, e_1,\ldots,e_k$, as follows. We only need to decide on splitting
the tuples $(d_1,\ldots,d_q)$ where all but one value $d_1,\ldots,d_q$ are from the set $D_X =
\{e\,x_1,\ldots,e\,x_n\}$; the values of relations on other tuples do not affect the truth
value of formulas in question and can be split arbitrarily. If $\{d_1,\ldots,d_q\} \subseteq D_X$,
then we assign the tuple to $e_0$; as a result, $[\![K[F]]\!]e_0 = \mathrm{True}$. If $\{d_1,\ldots,d_q\} \setminus D_X =
\{d\}$, then let $i$ be such that $F'_i$ is the unique extension of $F$ with the property
$[\![F'_i]\!]e[x \mapsto d] = \mathrm{True}$. Then assign the tuple $(d_1,\ldots,d_q)$ to the environment $e_i$
and also assign the values $(e^m B_q)\,d$ for all $q \in m$ to $e_i$. Because we assign each
relevant tuple to exactly one $e_i$, we ensure $\mathrm{split}\ e^m\ [e_0, e_1,\ldots,e_k]$. Let $D_E = \{d \mid
[\![F'_i]\!]e[x \mapsto d] = \mathrm{True}\}$; then also $D_E = \{d \mid [\![F'_i]\!]e_i[x \mapsto d] = \mathrm{True}\}$. Because
$[\![\exists^{s_i} x.\,F'_i]\!]e = \mathrm{True}$, $|D_E| = s_i$ for $s_i < p$ and $|D_E| \geq p$ for $s_i = p^+$. Let $s_i < p$.
Then split $e_i$ into $e_{i,1},\ldots,e_{i,s_i}$ by assigning exactly one element $d \in D_E$ to each
$e_{i,j}$. When assigning an element we assign the values of all relations from $L$, as
well as the relations $B_1$ and $B_2$. This ensures that $[\![(F'_i)_m]\!]e_{i,j} = \mathrm{True}$ for all
$1 \leq j \leq s_i$. For $s_i = p^+$, we split $e_i$ into $e_{i,0}, e_{i,1},\ldots,e_{i,p}$ by assigning exactly
one element to each of $e_{i,1},\ldots,e_{i,p}$ and assigning the remaining elements to $e_{i,0}$.
In both cases, we obtain $[\![X_m[\exists^{s_i} x.\,F'_i]]\!]e_i = \mathrm{True}$.


Lemma 15. If $G_1 \leadsto G_2$, then $G_2 \Rightarrow G_1$ is valid.


Proof. We show the claim for each of the rules (1)-(6).

Rule (1): Let $T_1 \odot T_2$ be defined and let $[\![(T_1 \odot T_2)_{1,2}]\!]e = \mathrm{True}$ for an
$L'$-environment $e$. Let $d \in D$ be the unique domain element such that
$[\![T_1 \odot T_2]\!]e[x \mapsto d] = \mathrm{True}$. Let $e_1$ and $e_2$ be such that $\mathrm{split}\ e\ [e_1, e_2]$,
$[\![T_1]\!]e_1[x \mapsto d] = \mathrm{True}$ and $[\![T_2]\!]e_2[x \mapsto d] = \mathrm{True}$, and $e_p B_q d = \mathrm{True}$ iff $p = q$
for $p, q \in \{1,2\}$. In other words, $e_1$ and $e_2$ split $e$ by assigning tuples validating
$T_1$ to $e_1$, tuples validating $T_2$ to $e_2$, and by assigning $B_1$ to $e_1$ and $B_2$ to $e_2$ on
the element $d$. The values of relations $e\,r$ containing tuples with an element
$d' \notin \{e\,x_1,\ldots,e\,x_n, d\}$ are all False, because $[\![(T_1 \odot T_2)_{1,2}]\!]e = \mathrm{True}$, so we let
the values of $e_1 r$ and $e_2 r$ for those tuples also be empty. Then $d$ is the only element
outside $\{e\,x_1,\ldots,e\,x_n\}$ such that $[\![T_1]\!]e_1[x \mapsto d] = \mathrm{True}$, and $d$ is also the only
element outside $\{e\,x_1,\ldots,e\,x_n\}$ such that $[\![T_2]\!]e_2[x \mapsto d] = \mathrm{True}$. As a result,
$[\![(T_1)_1]\!]e_1 = \mathrm{True}$ and $[\![(T_2)_2]\!]e_2 = \mathrm{True}$, so $[\![(T_1)_1 \circledast (T_2)_2]\!]e = \mathrm{True}$.

To show the claim for rules (2), (3), (4), we proceed similarly as for rule (1).

Rule (2): Let $T_1 \odot T_2$ be defined and let $[\![(T_1 \odot T_2)_{1,2} \circledast (T_2)^*_2]\!]e = \mathrm{True}$.
Then there are $e'$ and $e''$ such that $\mathrm{split}\ e\ [e', e'']$, $[\![(T_1 \odot T_2)_{1,2}]\!]e' = \mathrm{True}$ and
$[\![(T_2)^*_2]\!]e'' = \mathrm{True}$. Let $d$ be the unique element such that $[\![T_1 \odot T_2]\!]e'[x \mapsto d] = \mathrm{True}$,
and let $d_1,\ldots,d_k$ be the list of all (distinct) elements such that
$[\![T_2]\!]e''[x \mapsto d_i] = \mathrm{True}$. Note that $d \notin \{d_1,\ldots,d_k\}$, because $e' B_2 d = \mathrm{True}$,
$e'' B_2 d_i = \mathrm{True}$ for all $1 \leq i \leq k$, and $\mathrm{split}\ e\ [e', e'']$. We construct $e_1$ and $e_2$ such
that $\mathrm{split}\ e\ [e_1, e_2]$ as follows. We assign $B_1$, as well as the values of relations that
hold according to $T_1$ on element $d$, to $e_1$, and we assign $B_2$, as well as the values
of relations that hold according to $T_2$ on element $d$, to $e_2$. We assign $B_2$ as well as
the values of relations that hold according to $T_2$ on $d_1,\ldots,d_k$ to $e_2$. The values
of $B_1$ and the relations on $d_1,\ldots,d_k$ for $e_1$ are empty. For such $e_1$ and $e_2$ we
have $[\![(T_1)_1]\!]e_1 = \mathrm{True}$ and $[\![(T_2)^*_2]\!]e_2 = \mathrm{True}$, so $[\![(T_1)_1 \circledast (T_2)^*_2]\!]e = \mathrm{True}$.

Rule (3) is analogous to rule (2).

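Rule (4): Let $T_1 \odot T_2$ be defined and let
$[\![(T_1)^*_1 \circledast (T_2)^*_2 \circledast (T_1 \odot T_2)^*_{1,2}]\!]e = \mathrm{True}$.
Then there are $e', e'', e'''$ such that $\mathrm{split}\ e\ [e', e'', e''']$,
$[\![(T_1)^*_1]\!]e' = \mathrm{True}$, $[\![(T_2)^*_2]\!]e'' = \mathrm{True}$, and
$[\![(T_1 \odot T_2)^*_{1,2}]\!]e''' = \mathrm{True}$. Then there are three sets of elements
$N'$, $N''$, $N'''$, where $N'$ contains elements that validate $T_1$ in $e'$, $N''$ contains
elements that validate $T_2$ in $e''$, and $N'''$ contains elements that validate
$T_1 \odot T_2$ in $e'''$. We have $N' \cap N''' = \emptyset$ and $N'' \cap N''' = \emptyset$,
whereas $N' \cap N''$ need not be empty. Each element $d \notin \{e\,x_1, \ldots, e\,x_n\}$
validates in $e$ either 1) empExg(F, x), if $d \notin N' \cup N'' \cup N'''$, or 2) $T_1$, if
$d \in N' \setminus N''$, or 3) $T_2$, if $d \in N'' \setminus N'$, or 4) $T_1 \odot T_2$, if
$d \in (N' \cap N'') \cup N'''$. We construct environments $e_1$, $e_2$ by assigning $B_1$ and
relations from $T_1$ on elements in $N' \setminus N''$ to $e_1$, assigning $B_2$ and relations
from $T_2$ on elements in $N'' \setminus N'$ to $e_2$, and splitting relations on elements in
$(N' \cap N'') \cup N'''$ into those for $T_1$, which we assign to $e_1$, and those for $T_2$,
which we assign to $e_2$. We then have $[\![(T_1)^*_1]\!]e_1 = \mathrm{True}$ and
$[\![(T_2)^*_2]\!]e_2 = \mathrm{True}$, so $[\![(T_1)^*_1 \circledast (T_2)^*_2]\!]e = \mathrm{True}$.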
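
The four-way case split on elements used for rule (4) can be written out as a small function. This is a minimal sketch, assuming the sets $N'$, $N''$, $N'''$ are given as finite Python sets; the function name `classify` and the string labels are hypothetical and serve only to name the four cases.

```python
def classify(d, n1, n2, n3):
    """Return the case of the rule (4) argument that element d falls under.

    n1, n2, n3 stand for N', N'', N''' respectively.
    """
    if n1 & n3 or n2 & n3:
        raise ValueError("N''' must be disjoint from N' and from N''")
    if d in n3 or (d in n1 and d in n2):
        return "T1.T2"      # case 4: d validates the combined type
    if d in n1:
        return "T1"         # case 2: d is in N' but not in N''
    if d in n2:
        return "T2"         # case 3: d is in N'' but not in N'
    return "empty"          # case 1: d has the empty extension


n1, n2, n3 = {"a", "b"}, {"b", "c"}, {"x"}
cases = {d: classify(d, n1, n2, n3) for d in ["a", "b", "c", "x", "z"]}
print(cases)
```

Elements labelled `T1` or `T1.T2` contribute their $T_1$ part to $e_1$, and elements labelled `T2` or `T1.T2` contribute their $T_2$ part to $e_2$, which is how the construction recovers the two star formulas.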

Rules (5), (6): Directly from the definitions of empe and (F')*, it follows that
empe => (F')*.


Lemma 17. $\left(\bigvee_{R(C_1, C_2, C_3)} C_3\right) \Rightarrow (C_1 \circledast C_2)$ is a
valid formula for every pair of counting star formulas $C_1$ and $C_2$.


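Proof. Let $[\![\bigvee_{R(C_1, C_2, C_3)} C_3]\!]e$ hold for some L-environment $e$. Then
$[\![C_3]\!]e = \mathrm{True}$ for some $C_3$ such that
$S_1[C_1] \circledast S_2[C_2] \Longrightarrow S_{1,2}[C_3]$. By Lemma 16,
$S_{1,2}[C_3] \Rightarrow S_1[C_1] \circledast S_2[C_2]$ is valid. By Lemma 13 and
$[\![C_3]\!]e = \mathrm{True}$, we have $[\![S_{1,2}[C_3]]\!]e^{1,2} = \mathrm{True}$. Therefore,
$[\![S_1[C_1] \circledast S_2[C_2]]\!]e^{1,2} = \mathrm{True}$. This means that there are $e_1$
and $e_2$ such that $\mathrm{split}\ e^{1,2}\ [e_1, e_2]$, $[\![S_1[C_1]]\!]e_1 = \mathrm{True}$,
and $[\![S_2[C_2]]\!]e_2 = \mathrm{True}$. From Lemma 13 we have
$[\![C_1]\!]\overline{e_1} = \mathrm{True}$ and $[\![C_2]\!]\overline{e_2} = \mathrm{True}$.
From $\mathrm{split}\ e^{1,2}\ [e_1, e_2]$ it follows that
$\mathrm{split}\ e\ [\overline{e_1}, \overline{e_2}]$, so
$[\![C_1 \circledast C_2]\!]e = \mathrm{True}$.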


Lemma 18. $C_1 \circledast C_2 \Rightarrow \bigvee_{R(C_1, C_2, C_3)} C_3$ is a valid formula
for every pair of counting star formulas $C_1$ and $C_2$.


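Proof. Let $[\![C_1 \circledast C_2]\!]e = \mathrm{True}$ for some L-environment $e$. Then
there are $e_1$ and $e_2$ such that $\mathrm{split}\ e\ [e_1, e_2]$,
$[\![C_1]\!]e_1 = \mathrm{True}$ and $[\![C_2]\!]e_2 = \mathrm{True}$. By Lemma 13,
$[\![S_1[C_1]]\!]e_1^1 = \mathrm{True}$ and $[\![S_2[C_2]]\!]e_2^2 = \mathrm{True}$. We
construct $S_{1,2}[C_3]$ such that $S_1[C_1] \circledast S_2[C_2] \Longrightarrow S_{1,2}[C_3]$
and $[\![C_3]\!]e = \mathrm{True}$, as follows.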

Let $K_1$ be the GCCAT part of $C_1$ and let $K_2$ be the GCCAT part of $C_2$. Let
$D_X = D \setminus \{e\,x_1, \ldots, e\,x_n\}$. For each $d \in D_X$, let $T_1^d$ be the type
extension induced by $d$ in $e_1^1$, that is, let $T_1^d \in \mathrm{exts}(K_1, x)$ be the
formula such that $[\![T_1^d]\!]e_1^1[x \mapsto d] = \mathrm{True}$. Similarly, let
$T_2^d \in \mathrm{exts}(K_2, x)$ be the formula such that
$[\![T_2^d]\!]e_2^2[x \mapsto d] = \mathrm{True}$. Because $\mathrm{split}\ e\ [e_1, e_2]$, the
operation $T_1^d \odot T_2^d$ is defined and
$[\![T_1^d \odot T_2^d]\!]e^{1,2}[x \mapsto d] = \mathrm{True}$. Because
$[\![S_1[C_1]]\!]e_1^1 = \mathrm{True}$, with each $d$ we can associate an occurrence
$\mu_1(d)$ in $S_1[C_1]$ of a formula $F_{\mu_1(d)}$, where $F_{\mu_1(d)}$ is of the form
$(T_1^d)_1$ or of the form $(T_1^d)^*_1$, and an environment $e_{1,\mu_1(d)}$ such that
$\mathrm{split}\ e_1^1\ [e_{1,0}, (e_{1,\mu_1(d)})_{\mu_1(d)}]$, such that
$[\![K[K_1]]\!]e_{1,0} = \mathrm{True}$, and such that for every $d$,
$[\![F_{\mu_1(d)}]\!]e_{1,\mu_1(d)} = \mathrm{True}$. Analogously, with each $d$ we can associate
an occurrence $\mu_2(d)$ in $S_2[C_2]$ of a formula $F_{\mu_2(d)}$ of the form $(T_2^d)_2$ or of
the form $(T_2^d)^*_2$, and an environment $e_{2,\mu_2(d)}$ such that
$\mathrm{split}\ e_2^2\ [e_{2,0}, (e_{2,\mu_2(d)})_{\mu_2(d)}]$, such that
$[\![K[K_2]]\!]e_{2,0} = \mathrm{True}$, and such that for every $d$,
$[\![F_{\mu_2(d)}]\!]e_{2,\mu_2(d)} = \mathrm{True}$.

We compute $C_3$ by first combining $K[K_1]$ and $K[K_2]$ into $K[K_1 \odot K_2]$.
From $\mathrm{split}\ e\ [e_1, e_2]$ we conclude that the operation $K_1 \odot K_2$ is
well-defined and that $[\![K[K_1 \odot K_2]]\!]e_0^{1,2} = \mathrm{True}$, where $e_0^{1,2}$ is
given by $\mathrm{split}\ e_0^{1,2}\ [e_{1,0}, e_{2,0}]$.

We next apply rules (1)-(4) in Figure 7, as follows:


1. apply rule (1) once to each pair of occurrences $\mu_1(d)$ and $\mu_2(d)$ if they are
of the form $(T_1^d)_1$ and $(T_2^d)_2$, respectively; let $\mu(d)$ be the occurrence of the
resulting formula $F_{\mu(d)} = (T_1^d \odot T_2^d)_{1,2}$;

2. apply rule (2) once to each pair of occurrences $\mu_1(d)$ and $\mu_2(d)$ if $\mu_1(d)$ is an
occurrence of the form $(T_1^d)_1$ and $\mu_2(d)$ is an occurrence of the form $(T_2^d)^*_2$;
let $\mu(d)$ be the occurrence of the formula $F_{\mu(d)} = (T_1^d \odot T_2^d)_{1,2}$ obtained
as one of the results;

3. apply rule (3) once to each pair of occurrences $\mu_1(d)$ and $\mu_2(d)$ if $\mu_1(d)$ is an
occurrence of the form $(T_1^d)^*_1$ and $\mu_2(d)$ is an occurrence of the form $(T_2^d)_2$;
let $\mu(d)$ be the occurrence of the formula $F_{\mu(d)} = (T_1^d \odot T_2^d)_{1,2}$ obtained
as one of the results;

4. apply rule (4) once for each pair of occurrences of formulas of the form $(T_1^d)^*_1$
and $(T_2^d)^*_2$; for each $d$ such that $\mu_1(d)$ is an occurrence of $(T_1^d)^*_1$ and
$\mu_2(d)$ is an occurrence of $(T_2^d)^*_2$, let $\mu(d)$ be the occurrence of the resulting
formula $F_{\mu(d)} = (T_1^d \odot T_2^d)^*_{1,2}$.

Note that no rule is applied more than once to any given pair of occurrences of formulas.
This means that the number of applications of rules is uniformly bounded, despite
the fact that there is no bound on the size of the model $e$. In particular,
there is no bound on the number of elements $d$ covered by a single application
of rule (4). Each formula of the form $(T)_1$ is $F_{\mu_1(d)}$ for some $d$ and each formula
of the form $(T)_2$ is $F_{\mu_2(d)}$ for some $d$, and all such formulas are consumed by
applications of rules (1)-(3), so the resulting formula has no subformulas of the
form $(T)_1$ or $(T)_2$. After applying rules (1)-(4), apply rules (5) and (6) to all
applicable formulas. The resulting formula $F_R$ has no occurrences of $(T)^*_1$ or
$(T)^*_2$ either; it contains only occurrences of formulas of the forms $(T)_{1,2}$ and
$(T)^*_{1,2}$.
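
The choice of rule for each pair of occurrences, and the reason the number of rule applications does not grow with the model, can be sketched as follows. This is an illustrative sketch only: occurrences are represented by hypothetical string identifiers, the maps `mu1` and `mu2` play the role of $\mu_1$ and $\mu_2$, and `form1` and `form2` record whether an occurrence is a non-starred or a starred formula.

```python
ONE, STAR = "one", "star"

# Which rule of Figure 7 applies to a pair of occurrence forms.
RULE = {(ONE, ONE): 1, (ONE, STAR): 2, (STAR, ONE): 3, (STAR, STAR): 4}


def plan_rule_applications(mu1, mu2, form1, form2):
    """Map each pair of occurrences (mu1(d), mu2(d)) to the rule applied to it.

    mu1, mu2: element -> occurrence identifier
    form1, form2: occurrence identifier -> ONE or STAR
    Each pair is recorded once, however many elements d are mapped to it.
    """
    applications = {}
    for d in mu1:
        pair = (mu1[d], mu2[d])
        applications[pair] = RULE[(form1[pair[0]], form2[pair[1]])]
    return applications


# d0 is covered by non-starred occurrences; d1, d2, d3 share one starred pair.
mu1 = {"d0": "p", "d1": "s", "d2": "s", "d3": "s"}
mu2 = {"d0": "q", "d1": "t", "d2": "t", "d3": "t"}
form1 = {"p": ONE, "s": STAR}
form2 = {"q": ONE, "t": STAR}
plan = plan_rule_applications(mu1, mu2, form1, form2)
print(plan)
```

Adding further elements that map to the starred pair `("s", "t")` leaves `plan` unchanged, which reflects that a single application of rule (4) covers an unbounded number of elements.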


For each of the finitely many occurrences $\mu(d)$ in $F_R$ we construct an environment
$e^{1,2}_{\mu(d)}$ by splitting $e^{1,2}$ into the environment $e_0^{1,2}$ defined above and
the environments $e^{1,2}_{\mu(d)}$, assigning the type extension of $d$ in $e^{1,2}$ to
$e^{1,2}_{\mu(d)}$. By construction,
$\mathrm{split}\ e^{1,2}\ [e_0^{1,2}, (e^{1,2}_{\mu(d)})_{\mu(d)}]$. To show
$[\![F_R]\!]e^{1,2} = \mathrm{True}$, it suffices to show

$$[\![F_c]\!]e_c^{1,2} = \mathrm{True} \qquad (5)$$

for every occurrence $c = \mu(d_0)$. Fix an occurrence $c$, and let
$\delta = \{d \mid \mu(d) = c\}$. By definition of $e_c^{1,2}$, the type extension induced by
each $d \in \delta$ in $e_c^{1,2}$ is $T_1^d \odot T_2^d$, and the type extension of each
$d \in D_X \setminus \delta$ is an empty extension. Therefore,
$[\![(T_1^d \odot T_2^d)^*_{1,2}]\!]e_c^{1,2} = \mathrm{True}$. If
$F_c = (T_1^d \odot T_2^d)^*_{1,2}$, then equation (5) already holds. If
$F_c = (T_1^d \odot T_2^d)_{1,2}$, then $F_c$ was generated by one of the rules (1)-(3),
which means that $\delta$ is a singleton set. Namely, if $F_c$ was generated by rule (1) or
(2), then there is exactly one $d$ such that $\mu_1(d) = c$, namely $d_0$, and similarly if
$F_c$ was generated by rule (3), then there is exactly one $d$ such that $\mu_2(d) = c$, again
$d_0$. In both cases, $\delta = \{d_0\}$, so $d_0$ is the unique $d$ with type extension
$T_1^d \odot T_2^d$, which means that
$[\![(T_1^d \odot T_2^d)_{1,2}]\!]e_c^{1,2} = \mathrm{True}$ and equation (5) holds.

We finally apply idempotence to ensure that no $(T)^*_{1,2}$ occurs more than
once. The resulting formula $F_P$ is equivalent to $F_R$, so
$[\![F_P]\!]e^{1,2} = \mathrm{True}$, $F_P$ is of the form $S_{1,2}[C_3]$, and
$S_1[C_1] \circledast S_2[C_2] \Longrightarrow S_{1,2}[C_3]$. From $S_{1,2}[C_3]$ we
recover $C_3$ using the inverse of the translation in Figure 6. By Lemma 13 we
have $[\![C_3]\!]e = \mathrm{True}$, completing the proof.
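
Read together, the implications of Lemma 17 and Lemma 18 point in opposite directions, so for every pair of counting star formulas $C_1$ and $C_2$ they combine into a single equivalence. The display below only restates the two lemmas side by side; the symbols are the ones used in their statements.

```latex
\[
  C_1 \circledast C_2
  \;\Longleftrightarrow\;
  \bigvee_{R(C_1, C_2, C_3)} C_3
\]
```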